linux

mirror of https://github.com/torvalds/linux.git synced 2026-06-07 14:04:54 +02:00

Author	SHA1	Message	Date
Greg Kroah-Hartman	b9a61f9a56	Merge 5.10.27 into android12-5.10 Changes in 5.10.27 mm/memcg: rename mem_cgroup_split_huge_fixup to split_page_memcg and add nr_pages argument mm/memcg: set memcg when splitting page mt76: fix tx skb error handling in mt76_dma_tx_queue_skb net: stmmac: fix dma physical address of descriptor when display ring net: fec: ptp: avoid register access when ipg clock is disabled powerpc/4xx: Fix build errors from mfdcr() atm: eni: dont release is never initialized atm: lanai: dont run lanai_dev_close if not open Revert "r8152: adjust the settings about MAC clock speed down for RTL8153" ALSA: hda: ignore invalid NHLT table ixgbe: Fix memleak in ixgbe_configure_clsu32 scsi: ufs: ufs-qcom: Disable interrupt in reset path blk-cgroup: Fix the recursive blkg rwstat net: tehuti: fix error return code in bdx_probe() net: intel: iavf: fix error return code of iavf_init_get_resources() sun/niu: fix wrong RXMAC_BC_FRM_CNT_COUNT count gianfar: fix jumbo packets+napi+rx overrun crash cifs: ask for more credit on async read/write code paths gfs2: fix use-after-free in trans_drain cpufreq: blacklist Arm Vexpress platforms in cpufreq-dt-platdev gpiolib: acpi: Add missing IRQF_ONESHOT nfs: fix PNFS_FLEXFILE_LAYOUT Kconfig default NFS: Correct size calculation for create reply length net: hisilicon: hns: fix error return code of hns_nic_clear_all_rx_fetch() net: wan: fix error return code of uhdlc_init() net: davicom: Use platform_get_irq_optional() net: enetc: set MAC RX FIFO to recommended value atm: uPD98402: fix incorrect allocation atm: idt77252: fix null-ptr-dereference cifs: change noisy error message to FYI irqchip/ingenic: Add support for the JZ4760 kbuild: add image_name to no-sync-config-targets kbuild: dummy-tools: fix inverted tests for gcc umem: fix error return code in mm_pci_probe() sparc64: Fix opcode filtering in handling of no fault loads habanalabs: Call put_pid() when releasing control device staging: rtl8192e: fix kconfig dependency on 
CRYPTO u64_stats,lockdep: Fix u64_stats_init() vs lockdep kselftest: arm64: Fix exit code of sve-ptrace regulator: qcom-rpmh: Correct the pmic5_hfsmps515 buck block: Fix REQ_OP_ZONE_RESET_ALL handling drm/amd/display: Revert dram_clock_change_latency for DCN2.1 drm/amdgpu: fb BO should be ttm_bo_type_device drm/radeon: fix AGP dependency nvme: simplify error logic in nvme_validate_ns() nvme: add NVME_REQ_CANCELLED flag in nvme_cancel_request() nvme-fc: set NVME_REQ_CANCELLED in nvme_fc_terminate_exchange() nvme-fc: return NVME_SC_HOST_ABORTED_CMD when a command has been aborted nvme-core: check ctrl css before setting up zns nvme-rdma: Fix a use after free in nvmet_rdma_write_data_done nvme-pci: add the DISABLE_WRITE_ZEROES quirk for a Samsung PM1725a nfs: we don't support removing system.nfs4_acl block: Suppress uevent for hidden device when removed mm/fork: clear PASID for new mm ia64: fix ia64_syscall_get_set_arguments() for break-based syscalls ia64: fix ptrace(PTRACE_SYSCALL_INFO_EXIT) sign static_call: Pull some static_call declarations to the type headers static_call: Allow module use without exposing static_call_key static_call: Fix the module key fixup static_call: Fix static_call_set_init() KVM: x86: Protect userspace MSR filter with SRCU, and set atomically-ish btrfs: fix sleep while in non-sleep context during qgroup removal selinux: don't log MAC_POLICY_LOAD record on failed policy load selinux: fix variable scope issue in live sidtab conversion netsec: restore phy power state after controller reset platform/x86: intel-vbtn: Stop reporting SW_DOCK events psample: Fix user API breakage z3fold: prevent reclaim/free race for headless pages squashfs: fix inode lookup sanity checks squashfs: fix xattr id and id lookup sanity checks hugetlb_cgroup: fix imbalanced css_get and css_put pair for shared mappings kasan: fix per-page tags for non-page_alloc pages gcov: fix clang-11+ support ACPI: video: Add missing callback back for Sony VPCEH3U1E ACPICA: Always 
create namespace nodes using acpi_ns_create_node() arm64: stacktrace: don't trace arch_stack_walk() arm64: dts: ls1046a: mark crypto engine dma coherent arm64: dts: ls1012a: mark crypto engine dma coherent arm64: dts: ls1043a: mark crypto engine dma coherent ARM: dts: at91: sam9x60: fix mux-mask for PA7 so it can be set to A, B and C ARM: dts: at91: sam9x60: fix mux-mask to match product's datasheet ARM: dts: at91-sama5d27_som1: fix phy address to 7 integrity: double check iint_cache was initialized drm/etnaviv: Use FOLL_FORCE for userptr drm/amd/pm: workaround for audio noise issue drm/amdgpu/display: restore AUX_DPHY_TX_CONTROL for DCN2.x drm/amdgpu: Add additional Sienna Cichlid PCI ID drm/i915: Fix the GT fence revocation runtime PM logic dm verity: fix DM_VERITY_OPTS_MAX value dm ioctl: fix out of bounds array access when no devices bus: omap_l3_noc: mark l3 irqs as IRQF_NO_THREAD ARM: OMAP2+: Fix smartreflex init regression after dropping legacy data soc: ti: omap-prm: Fix occasional abort on reset deassert for dra7 iva veth: Store queue_mapping independently of XDP prog presence bpf: Change inode_storage's lookup_elem return value from NULL to -EBADF libbpf: Fix INSTALL flag order net/mlx5e: RX, Mind the MPWQE gaps when calculating offsets net/mlx5e: When changing XDP program without reset, take refs for XSK RQs net/mlx5e: Don't match on Geneve options in case option masks are all zero ipv6: fix suspecious RCU usage warning drop_monitor: Perform cleanup upon probe registration failure macvlan: macvlan_count_rx() needs to be aware of preemption net: sched: validate stab values net: dsa: bcm_sf2: Qualify phydev->dev_flags based on port igc: reinit_locked() should be called with rtnl_lock igc: Fix Pause Frame Advertising igc: Fix Supported Pause Frame Link Setting igc: Fix igc_ptp_rx_pktstamp() e1000e: add rtnl_lock() to e1000_reset_task e1000e: Fix error handling in e1000_set_d0_lplu_state_82571 net/qlcnic: Fix a use after free in 
qlcnic_83xx_get_minidump_template net: phy: broadcom: Add power down exit reset state delay ftgmac100: Restart MAC HW once clk: qcom: gcc-sc7180: Use floor ops for the correct sdcc1 clk net: ipa: terminate message handler arrays net: qrtr: fix a kernel-infoleak in qrtr_recvmsg() flow_dissector: fix byteorder of dissected ICMP ID selftests/bpf: Set gopt opt_class to 0 if get tunnel opt failed netfilter: ctnetlink: fix dump of the expect mask attribute net: hdlc_x25: Prevent racing between "x25_close" and "x25_xmit"/"x25_rx" net: phylink: Fix phylink_err() function name error in phylink_major_config tipc: better validate user input in tipc_nl_retrieve_key() tcp: relookup sock for RST+ACK packets handled by obsolete req sock can: isotp: isotp_setsockopt(): only allow to set low level TX flags for CAN-FD can: isotp: TX-path: ensure that CAN frame flags are initialized can: peak_usb: add forgotten supported devices can: flexcan: flexcan_chip_freeze(): fix chip freeze for missing bitrate can: kvaser_pciefd: Always disable bus load reporting can: c_can_pci: c_can_pci_remove(): fix use-after-free can: c_can: move runtime PM enable/disable to c_can_platform can: m_can: m_can_do_rx_poll(): fix extraneous msg loss warning can: m_can: m_can_rx_peripheral(): fix RX being blocked by errors mac80211: fix rate mask reset mac80211: Allow HE operation to be longer than expected. selftests/net: fix warnings on reuseaddr_ports_exhausted nfp: flower: fix unsupported pre_tunnel flows nfp: flower: add ipv6 bit to pre_tunnel control message nfp: flower: fix pre_tun mask id allocation ftrace: Fix modify_ftrace_direct. 
drm/msm/dsi: fix check-before-set in the 7nm dsi_pll code ionic: linearize tso skb with too many frags net/sched: cls_flower: fix only mask bit check in the validate_ct_state netfilter: nftables: report EOPNOTSUPP on unsupported flowtable flags netfilter: nftables: allow to update flowtable flags netfilter: flowtable: Make sure GC works periodically in idle system libbpf: Fix error path in bpf_object__elf_init() libbpf: Use SOCK_CLOEXEC when opening the netlink socket ARM: dts: imx6ull: fix ubi filesystem mount failed ipv6: weaken the v4mapped source check octeontx2-af: Formatting debugfs entry rsrc_alloc. octeontx2-af: Modify default KEX profile to extract TX packet fields octeontx2-af: Remove TOS field from MKEX TX octeontx2-af: Fix irq free in rvu teardown octeontx2-pf: Clear RSS enable flag on interace down octeontx2-af: fix infinite loop in unmapping NPC counter net: check all name nodes in __dev_alloc_name net: cdc-phonet: fix data-interface release on probe failure igb: check timestamp validity r8152: limit the RX buffer size of RTL8153A for USB 2.0 net: stmmac: dwmac-sun8i: Provide TX and RX fifo sizes selinux: vsock: Set SID for socket returned by accept() selftests: forwarding: vxlan_bridge_1d: Fix vxlan ecn decapsulate value libbpf: Fix BTF dump of pointer-to-array-of-struct bpf: Fix umd memory leak in copy_process() can: isotp: tx-path: zero initialize outgoing CAN frames drm/msm: fix shutdown hook in case GPU components failed to bind drm/msm: Fix suspend/resume on i.MX5 arm64: kdump: update ppos when reading elfcorehdr PM: runtime: Defer suspending suppliers net/mlx5: Add back multicast stats for uplink representor net/mlx5e: Allow to match on MPLS parameters only for MPLS over UDP net/mlx5e: Offload tuple rewrite for non-CT flows net/mlx5e: Fix error path for ethtool set-priv-flag PM: EM: postpone creating the debugfs dir till fs_initcall net: bridge: don't notify switchdev for local FDB addresses octeontx2-af: Fix memory leak of object buf xen/x86: 
make XEN_BALLOON_MEMORY_HOTPLUG_LIMIT depend on MEMORY_HOTPLUG RDMA/cxgb4: Fix adapter LE hash errors while destroying ipv6 listening server bpf: Don't do bpf_cgroup_storage_set() for kuprobe/tp programs net: Consolidate common blackhole dst ops net, bpf: Fix ip6ip6 crash with collect_md populated skbs igb: avoid premature Rx buffer reuse net: axienet: Properly handle PCS/PMA PHY for 1000BaseX mode net: axienet: Fix probe error cleanup net: phy: introduce phydev->port net: phy: broadcom: Avoid forward for bcm54xx_config_clock_delay() net: phy: broadcom: Set proper 1000BaseX/SGMII interface mode for BCM54616S net: phy: broadcom: Fix RGMII delays for BCM50160 and BCM50610M Revert "netfilter: x_tables: Switch synchronization to RCU" netfilter: x_tables: Use correct memory barriers. dm table: Fix zoned model check and zone sectors check mm/mmu_notifiers: ensure range_end() is paired with range_start() Revert "netfilter: x_tables: Update remaining dereference to RCU" ACPI: scan: Rearrange memory allocation in acpi_device_add() ACPI: scan: Use unique number for instance_no perf auxtrace: Fix auxtrace queue conflict perf synthetic events: Avoid write of uninitialized memory when generating PERF_RECORD_MMAP* records io_uring: fix provide_buffers sign extension block: recalculate segment count for multi-segment discards correctly scsi: Revert "qla2xxx: Make sure that aborted commands are freed" scsi: qedi: Fix error return code of qedi_alloc_global_queues() scsi: mpt3sas: Fix error return code of mpt3sas_base_attach() smb3: fix cached file size problems in duplicate extents (reflink) cifs: Adjust key sizes and key generation routines for AES256 encryption locking/mutex: Fix non debug version of mutex_lock_io_nested() x86/mem_encrypt: Correct physical address calculation in __set_clr_pte_enc() mm/memcg: fix 5.10 backport of splitting page memcg fs/cachefiles: Remove wait_bit_key layout dependency ch_ktls: fix enum-conversion warning can: dev: Move device back to init netns 
on owning netns delete r8169: fix DMA being used after buffer free if WoL is enabled net: dsa: b53: VLAN filtering is global to all users mac80211: fix double free in ibss_leave ext4: add reclaim checks to xattr code fs/ext4: fix integer overflow in s_log_groups_per_flex Revert "xen: fix p2m size in dom0 for disabled memory hotplug case" Revert "net: bonding: fix error return code of bond_neigh_init()" nvme: fix the nsid value to print in nvme_validate_or_alloc_ns can: peak_usb: Revert "can: peak_usb: add forgotten supported devices" xen-blkback: don't leak persistent grants from xen_blkbk_map() Linux 5.10.27 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I7eafe976fd6bf33db6db4adb8ebf2ff087294a23	2021-04-02 15:25:50 +02:00
Mukesh Ojha	1c4893edfe	FROMGIT: pstore: Add mem_type property DT parsing support There could be a scenario where we define some region in normal memory and use it to store logs which are later retrieved by the bootloader during warm reset. In this scenario, we wanted to treat this memory as normal cacheable memory instead of the default behaviour, which is an overhead. Making it cacheable could improve performance. This commit gives control to change mem_type from Device tree, and also documents the value for normal memory. Bug: 179108912 Signed-off-by: Mukesh Ojha <mojha@codeaurora.org> (cherry picked from commit `9d843e8faf` git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git for-next/pstore) Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/1616438537-13719-1-git-send-email-mojha@codeaurora.org Change-Id: I56ae3c5dba729962854a5d590d8c80cc3aae12bd Signed-off-by: Mukesh Ojha <mojha@codeaurora.org>	2021-04-02 10:56:28 +00:00
Paul Lawrence	16ce7f9c5e	ANDROID: Incremental fs: Truncate file when complete Bug: 182185202 Test: incfs_test passes Signed-off-by: Paul Lawrence <paullawrence@google.com> Change-Id: I96a192011f19efa1c597275dafc6c216f8ed0b56	2021-04-01 11:10:22 -07:00
Paul Lawrence	38d8cfc0bd	ANDROID: Incremental fs: Fix mlock to fail gracefully on corrupt files Test: incfs_test passes Bug: 174875107 Signed-off-by: Paul Lawrence <paullawrence@google.com> Change-Id: I93ce3600e88ddd89cf69f032ea858d169b0a7bec	2021-04-01 11:10:22 -07:00
Paul Lawrence	2a8c6b0f30	ANDROID: Incremental fs: Finer readlog compression internally Bug: 182196484 Test: incfs_test passes Signed-off-by: Paul Lawrence <paullawrence@google.com> Change-Id: Icad395115ad81cc267046f7a41b41046077bb78b	2021-04-01 11:10:22 -07:00
Paul Lawrence	5c023f6fd1	ANDROID: Incremental fs: Support STATX_ATTR_VERITY Bug: 181242243 Test: incfs_test passes Signed-off-by: Paul Lawrence <paullawrence@google.com> Change-Id: Id996e0d5d95c8b42254d1e1e0c1dad9317183a17	2021-04-01 11:10:22 -07:00
Sabyrzhan Tasbolatov	df61d3cff4	fs/ext4: fix integer overflow in s_log_groups_per_flex commit `f91436d55a` upstream. syzbot found UBSAN: shift-out-of-bounds in ext4_mb_init [1], when 1 << sbi->s_es->s_log_groups_per_flex is bigger than UINT_MAX, where sbi->s_mb_prefetch is unsigned integer type. 32 is the maximum allowed power of s_log_groups_per_flex. Following if check will also trigger UBSAN shift-out-of-bound: if (1 << sbi->s_es->s_log_groups_per_flex >= UINT_MAX) { So I'm checking it against the raw number, perhaps there is another way to calculate UINT_MAX max power. Also use min_t as to make sure it's uint type. [1] UBSAN: shift-out-of-bounds in fs/ext4/mballoc.c:2713:24 shift exponent 60 is too large for 32-bit type 'int' Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x137/0x1be lib/dump_stack.c:120 ubsan_epilogue lib/ubsan.c:148 [inline] __ubsan_handle_shift_out_of_bounds+0x432/0x4d0 lib/ubsan.c:395 ext4_mb_init_backend fs/ext4/mballoc.c:2713 [inline] ext4_mb_init+0x19bc/0x19f0 fs/ext4/mballoc.c:2898 ext4_fill_super+0xc2ec/0xfbe0 fs/ext4/super.c:4983 Reported-by: syzbot+a8b4b0c60155e87e9484@syzkaller.appspotmail.com Signed-off-by: Sabyrzhan Tasbolatov <snovitoll@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20210224095800.3350002-1-snovitoll@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-30 14:32:08 +02:00
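The shift problem described in the commit above can be illustrated outside the kernel. The following is a minimal sketch, not the actual ext4 patch: `groups_per_flex()` is a hypothetical helper name, and the point is only that the raw exponent is compared against 32 before any shift happens, because shifting a 32-bit value by 32 or more is undefined in C.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not kernel code): compute 1 << log_groups only
 * when the shift is well defined for a 32-bit result.
 */
bool groups_per_flex(uint8_t log_groups, uint32_t *out)
{
    /*
     * Test the raw exponent instead of the shifted value. A check such as
     * (1 << log_groups) >= UINT_MAX would itself perform the bad shift.
     */
    if (log_groups >= 32)
        return false;

    /* Unsigned 1 so that a shift by 31 does not overflow a signed int. */
    *out = UINT32_C(1) << log_groups;
    return true;
}
```

With an exponent of 60, as in the UBSAN report, the helper rejects the value instead of shifting.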
Jan Kara	0229b5926d	ext4: add reclaim checks to xattr code commit `163f0ec1df` upstream. Syzbot is reporting that ext4 can enter fs reclaim from kvmalloc() while the transaction is started like: fs_reclaim_acquire+0x117/0x150 mm/page_alloc.c:4340 might_alloc include/linux/sched/mm.h:193 [inline] slab_pre_alloc_hook mm/slab.h:493 [inline] slab_alloc_node mm/slub.c:2817 [inline] __kmalloc_node+0x5f/0x430 mm/slub.c:4015 kmalloc_node include/linux/slab.h:575 [inline] kvmalloc_node+0x61/0xf0 mm/util.c:587 kvmalloc include/linux/mm.h:781 [inline] ext4_xattr_inode_cache_find fs/ext4/xattr.c:1465 [inline] ext4_xattr_inode_lookup_create fs/ext4/xattr.c:1508 [inline] ext4_xattr_set_entry+0x1ce6/0x3780 fs/ext4/xattr.c:1649 ext4_xattr_ibody_set+0x78/0x2b0 fs/ext4/xattr.c:2224 ext4_xattr_set_handle+0x8f4/0x13e0 fs/ext4/xattr.c:2380 ext4_xattr_set+0x13a/0x340 fs/ext4/xattr.c:2493 This should be impossible since transaction start sets PF_MEMALLOC_NOFS. Add some assertions to the code to catch if something isn't working as expected early. Link: https://lore.kernel.org/linux-ext4/000000000000563a0205bafb7970@google.com/ Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20210222171626.21884-1-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-30 14:32:08 +02:00
Matthew Wilcox (Oracle)	6f15c02ebb	fs/cachefiles: Remove wait_bit_key layout dependency commit `39f985c8f6` upstream. Cachefiles was relying on wait_page_key and wait_bit_key being the same layout, which is fragile. Now that wait_page_key is exposed in the pagemap.h header, we can remove that fragility. A comment on the need to maintain structure layout equivalence was added by Linus[1] and that is no longer applicable. Fixes: `6290602709` ("mm: add PageWaiters indicating tasks are waiting for a page bit") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: kafs-testing@auristor.com cc: linux-cachefs@redhat.com cc: linux-mm@kvack.org Link: https://lore.kernel.org/r/20210320054104.1300774-2-willy@infradead.org/ Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3510ca20ece0150af6b10c77a74ff1b5c198e3e2 [1] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-30 14:32:07 +02:00
Shyam Prasad N	d4ce2a8f46	cifs: Adjust key sizes and key generation routines for AES256 encryption commit `45a4546c61` upstream. For AES256 encryption (GCM and CCM), we need to adjust the size of a few fields to 32 bytes instead of 16 to accommodate the larger keys. Also, the L value supplied to the key generator needs to be changed from to 256 when these algorithms are used. Keeping the ioctl struct for dumping keys of the same size for now. Will send out a different patch for that one. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> CC: <stable@vger.kernel.org> # v5.10+ Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-30 14:32:07 +02:00
Steve French	86cc799e1d	smb3: fix cached file size problems in duplicate extents (reflink) commit `cfc63fc812` upstream. There were two problems (one of which could cause data corruption) that were noticed with duplicate extents (ie reflink) when debugging why various xfstests were being incorrectly skipped (e.g. generic/138, generic/140, generic/142). First, we were not updating the file size locally in the cache when extending a file due to reflink (it would refresh after actimeo expires) but xfstest was checking the size immediately which was still 0 so caused the test to be skipped. Second, we were setting the target file size (which could shrink the file) in all cases to the end of the reflinked range rather than only setting the target file size when reflink would extend the file. CC: <stable@vger.kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-30 14:32:07 +02:00
Pavel Begunkov	dcf2dfc161	io_uring: fix provide_buffers sign extension [ Upstream commit `d81269fecb` ] io_provide_buffers_prep()'s "p->len * p->nbufs" is prone to sign extension problems. Not a huge problem as it's only used for access_ok() and increases the checked length, but better to keep typing right. Reported-by: Colin Ian King <colin.king@canonical.com> Fixes: `efe68c1ca8` ("io_uring: validate the full range of provided buffers for access") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Reviewed-by: Colin Ian King <colin.king@canonical.com> Link: https://lore.kernel.org/r/562376a39509e260d8532186a06226e56eb1f594.1616149233.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-30 14:32:06 +02:00
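To see why a narrow product can inflate the checked length, here is a small user-space sketch. It is an illustration under assumptions, not the io_uring code: the function names are hypothetical, the field widths (a signed 32-bit length and a 16-bit count) are assumed, and the wraparound is modelled in unsigned arithmetic so the example itself avoids signed-overflow undefined behaviour.

```c
#include <stdint.h>

/*
 * Models a product computed in 32 bits and only then widened.
 * The conversion back to int32_t is implementation-defined for
 * out-of-range values; common compilers wrap modulo 2^32.
 */
uint64_t total_len_narrow(int32_t len, uint16_t nbufs)
{
    uint32_t wrapped = (uint32_t)len * (uint32_t)nbufs;

    /* If bit 31 is set, widening the signed value sign-extends it. */
    return (uint64_t)(int64_t)(int32_t)wrapped;
}

/* Widen first, then multiply. Assumes len is non-negative. */
uint64_t total_len_wide(int32_t len, uint16_t nbufs)
{
    return (uint64_t)len * nbufs;
}
```

For `len = 65536` and `nbufs = 32768` the true product is 2^31. The wide version returns exactly that, while the narrow version returns a value with the upper 32 bits set, which is the "increases the checked length" effect the commit message mentions.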
Phillip Lougher	269042e8ff	squashfs: fix xattr id and id lookup sanity checks commit `8b44ca2b63` upstream. The checks for maximum metadata block size is missing SQUASHFS_BLOCK_OFFSET (the two byte length count). Link: https://lkml.kernel.org/r/2069685113.2081245.1614583677427@webmail.123-reg.co.uk Fixes: `f37aa4c736` ("squashfs: add more sanity checks in id lookup") Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk> Cc: Sean Nyekjaer <sean@geanix.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-30 14:31:54 +02:00
Sean Nyekjaer	61d72c5952	squashfs: fix inode lookup sanity checks commit `c1b2028315` upstream. When mounting a squashfs image created without inode compression it fails with: "unable to read inode lookup table" It turns out that the BLOCK_OFFSET is missing when checking the SQUASHFS_METADATA_SIZE against the actual size. Link: https://lkml.kernel.org/r/20210226092903.1473545-1-sean@geanix.com Fixes: `eabac19e40` ("squashfs: add more sanity checks in inode lookup") Signed-off-by: Sean Nyekjaer <sean@geanix.com> Acked-by: Phillip Lougher <phillip@squashfs.org.uk> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-30 14:31:54 +02:00
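The two squashfs commits above describe the same off-by-a-header mistake: an uncompressed metadata block occupies its payload plus the two-byte length count on disk. The sketch below illustrates that bound check with hypothetical names. The value 8192 for the metadata block size is an assumption taken from general knowledge of squashfs, not from the commit text.

```c
#include <stdbool.h>
#include <stdint.h>

#define METADATA_SIZE 8192u /* assumed value of SQUASHFS_METADATA_SIZE */
#define BLOCK_OFFSET  2u    /* the two-byte length count from the commit text */

/*
 * Hypothetical check: is the distance between two consecutive index
 * entries plausible for one metadata block, header included?
 */
bool index_gap_ok(uint64_t start, uint64_t next)
{
    if (start >= next)
        return false;

    /* Without "+ BLOCK_OFFSET" a full uncompressed block is rejected. */
    return (next - start) <= (uint64_t)METADATA_SIZE + BLOCK_OFFSET;
}
```

A gap of 8194 bytes (a full uncompressed block plus its header) passes, while 8195 bytes is still rejected.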
Filipe Manana	3b87d0c583	btrfs: fix sleep while in non-sleep context during qgroup removal commit `0bb7883009` upstream. While removing a qgroup's sysfs entry we end up taking the kernfs_mutex, through kobject_del(), while holding the fs_info->qgroup_lock spinlock, producing the following trace: [821.843637] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:281 [821.843641] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 28214, name: podman [821.843644] CPU: 3 PID: 28214 Comm: podman Tainted: G W 5.11.6 #15 [821.843646] Hardware name: Dell Inc. PowerEdge R330/084XW4, BIOS 2.11.0 12/08/2020 [821.843647] Call Trace: [821.843650] dump_stack+0xa1/0xfb [821.843656] ___might_sleep+0x144/0x160 [821.843659] mutex_lock+0x17/0x40 [821.843662] kernfs_remove_by_name_ns+0x1f/0x80 [821.843666] sysfs_remove_group+0x7d/0xe0 [821.843668] sysfs_remove_groups+0x28/0x40 [821.843670] kobject_del+0x2a/0x80 [821.843672] btrfs_sysfs_del_one_qgroup+0x2b/0x40 [btrfs] [821.843685] __del_qgroup_rb+0x12/0x150 [btrfs] [821.843696] btrfs_remove_qgroup+0x288/0x2a0 [btrfs] [821.843707] btrfs_ioctl+0x3129/0x36a0 [btrfs] [821.843717] ? __mod_lruvec_page_state+0x5e/0xb0 [821.843719] ? page_add_new_anon_rmap+0xbc/0x150 [821.843723] ? kfree+0x1b4/0x300 [821.843725] ? 
mntput_no_expire+0x55/0x330 [821.843728] __x64_sys_ioctl+0x5a/0xa0 [821.843731] do_syscall_64+0x33/0x70 [821.843733] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [821.843736] RIP: 0033:0x4cd3fb [821.843741] RSP: 002b:000000c000906b20 EFLAGS: 00000206 ORIG_RAX: 0000000000000010 [821.843744] RAX: ffffffffffffffda RBX: 000000c000050000 RCX: 00000000004cd3fb [821.843745] RDX: 000000c000906b98 RSI: 000000004010942a RDI: 000000000000000f [821.843747] RBP: 000000c000907cd0 R08: 000000c000622901 R09: 0000000000000000 [821.843748] R10: 000000c000d992c0 R11: 0000000000000206 R12: 000000000000012d [821.843749] R13: 000000000000012c R14: 0000000000000200 R15: 0000000000000049 Fix this by removing the qgroup sysfs entry while not holding the spinlock, since the spinlock is only meant for protection of the qgroup rbtree. Reported-by: Stuart Shelton <srcshelton@gmail.com> Link: https://lore.kernel.org/linux-btrfs/7A5485BB-0628-419D-A4D3-27B1AF47E25A@gmail.com/ Fixes: `49e5fb4621` ("btrfs: qgroup: export qgroups in sysfs") CC: stable@vger.kernel.org # 5.10+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-30 14:31:53 +02:00
J. Bruce Fields	9f70460801	nfs: we don't support removing system.nfs4_acl [ Upstream commit `4f8be1f53b` ] The NFSv4 protocol doesn't have any notion of removing an attribute, so removexattr(path,"system.nfs4_acl") doesn't make sense. There's no documented return value. Arguably it could be EOPNOTSUPP but I'm a little worried an application might take that to mean that we don't support ACLs or xattrs. How about EINVAL? Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-30 14:31:52 +02:00
Paulo Alcantara	b684c380f0	cifs: change noisy error message to FYI [ Upstream commit `e3d100eae4` ] A customer has reported that their dmesg were being flooded by CIFS: VFS: \\server Cancelling wait for mid xxx cmd: a CIFS: VFS: \\server Cancelling wait for mid yyy cmd: b CIFS: VFS: \\server Cancelling wait for mid zzz cmd: c because some processes that were performing statfs(2) on the share had been interrupted due to their automount setup when certain users logged in and out. Change it to FYI as they should be mostly informative rather than error messages. Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-30 14:31:50 +02:00
Frank Sorenson	9d1a5392ac	NFS: Correct size calculation for create reply length [ Upstream commit `ad3dbe35c8` ] CREATE requests return a post_op_fh3, rather than nfs_fh3. The post_op_fh3 includes an extra word to indicate 'handle_follows'. Without that additional word, create fails when full 64-byte filehandles are in use. Add NFS3_post_op_fh_sz, and correct the size calculation for NFS3_createres_sz. Signed-off-by: Frank Sorenson <sorenson@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-30 14:31:49 +02:00
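The size arithmetic in the commit above can be spelled out in XDR words (4 bytes each). This is a sketch with hypothetical names, not the kernel's macros, and it assumes a file handle is encoded as one length word followed by the handle bytes. Only the file-handle part of the CREATE reply is modelled.

```c
#include <stddef.h>

enum {
    XDR_WORD_BYTES = 4,  /* XDR encodes everything in 4-byte units */
    FH_MAX_BYTES   = 64  /* full-size NFSv3 file handle, per the commit text */
};

/* nfs_fh3: one length word followed by the handle data (assumed encoding). */
size_t fh3_words(void)
{
    return 1 + FH_MAX_BYTES / XDR_WORD_BYTES;
}

/* post_op_fh3: one extra 'handle_follows' word in front of the nfs_fh3. */
size_t post_op_fh3_words(void)
{
    return 1 + fh3_words();
}
```

With 64-byte handles the plain handle needs 17 words and the post-op form needs 18, so a reply buffer sized for the former is one word short.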
Timo Rothenpieler	2479c6b9ef	nfs: fix PNFS_FLEXFILE_LAYOUT Kconfig default [ Upstream commit `a0590473c5` ] This follows what was done in `8c2fabc654`. With the default being m, it's impossible to build the module into the kernel. Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-30 14:31:49 +02:00
Bob Peterson	6d7dce3bdf	gfs2: fix use-after-free in trans_drain [ Upstream commit `1a5a2cfd34` ] This patch adds code to function trans_drain to remove drained bd elements from the ail lists, if queued, before freeing the bd. If we don't remove the bd from the ail, function ail_drain will try to reference the bd after it has been freed by trans_drain. Thanks to Andy Price for his analysis of the problem. Reported-by: Andy Price <anprice@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-30 14:31:49 +02:00
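The pattern behind the gfs2 fix above (unlink an object from every list it is queued on before freeing it) can be sketched in user space. Everything here is hypothetical: the list helpers only loosely resemble the kernel's list API, and `struct buf` stands in for the bd element with one link per list.

```c
#include <stddef.h>
#include <stdlib.h>

/* Minimal circular doubly linked list node. */
struct node {
    struct node *prev, *next;
};

void node_init(struct node *n) { n->prev = n->next = n; }

int node_unlinked(const struct node *n) { return n->next == n; }

void node_add_tail(struct node *n, struct node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

void node_del(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    node_init(n); /* leave the node self-linked so it reads as unlinked */
}

/* Hypothetical descriptor that can be queued on two lists at once. */
struct buf {
    struct node tr_link;  /* membership in the transaction list */
    struct node ail_link; /* membership in the second ("ail") list */
};

struct buf *buf_new(struct node *tr_list, struct node *ail_list)
{
    struct buf *b = malloc(sizeof(*b));

    if (!b)
        return NULL;
    node_init(&b->tr_link);
    node_init(&b->ail_link);
    node_add_tail(&b->tr_link, tr_list);
    if (ail_list)
        node_add_tail(&b->ail_link, ail_list);
    return b;
}

/* Free every buf on tr_list, removing it from the other list first. */
void drain(struct node *tr_list)
{
    while (!node_unlinked(tr_list)) {
        struct buf *b = (struct buf *)((char *)tr_list->next -
                                       offsetof(struct buf, tr_link));

        node_del(&b->tr_link);
        if (!node_unlinked(&b->ail_link))
            node_del(&b->ail_link); /* skipping this leaves a dangling entry */
        free(b);
    }
}
```

After `drain()` both list heads are empty, so a later walk of the second list cannot reach freed memory.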
Aurelien Aptel	419ebba40d	cifs: ask for more credit on async read/write code paths [ Upstream commit `88fd98a230` ] When doing a large read or write workload we only very gradually increase the number of credits which can cause problems with parallelizing large i/o (I/O ramps up more slowly than it should for large read/write workloads) especially with multichannel when the number of credits on the secondary channels starts out low (e.g. less than about 130) or when recovering after the server throttled back the number of credits. Signed-off-by: Aurelien Aptel <aaptel@suse.com> Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-30 14:31:49 +02:00
Jaegeuk Kim	42e7f89d79	Revert "ANDROID: Revert "f2fs: fix to tag FIEMAP_EXTENT_MERGED in f2fs_fiemap()"" This reverts commit `3520187422`. Reason for revert: need to reapply after 3/26 Change-Id: I57a17fdd39c6eb1eb372a4b031f1935b6661cb62 Signed-off-by: Todd Kjos <tkjos@google.com>	2021-03-26 15:56:48 +00:00
Greg Kroah-Hartman	99941e23f7	Merge branch 'android12-5.10-lts' into 'android12-5.10' Updates the branch to the 5.10.26 upstream kernel version. Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I84aa29bf4e4e809051eb346830c4c4b5acb78c8c	2021-03-26 08:25:46 +01:00
Greg Kroah-Hartman	d5992d56cc	ANDROID: fix up ext4 build from 5.10.26 In commit `b7ff91fd03` ("ext4: find old entry again if failed to rename whiteout") a new call to ext4_find_entry() was made, but in commit `705a3e5b18` ("ANDROID: ext4: Handle casefolding with encryption") only in the ANDROID tree, a new parameter is added to that function. Add NULL there to keep the build working, hopefully one day the out-of-tree patch will get merged upstream... Fixes: `705a3e5b18` ("ANDROID: ext4: Handle casefolding with encryption") Fixes: `b7ff91fd03` ("ext4: find old entry again if failed to rename whiteout") Cc: Daniel Rosenberg <drosen@google.com> Cc: Paul Lawrence <paullawrence@google.com> Bug: 138322712 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I69b7f9c12d1f9016b8269e5bc7878469700b6477	2021-03-25 17:16:14 +01:00
Greg Kroah-Hartman	57b60a3a15	Merge 5.10.26 into android12-5.10-lts Changes in 5.10.26 ASoC: ak4458: Add MODULE_DEVICE_TABLE ASoC: ak5558: Add MODULE_DEVICE_TABLE spi: cadence: set cqspi to the driver_data field of struct device ALSA: dice: fix null pointer dereference when node is disconnected ALSA: hda/realtek: apply pin quirk for XiaomiNotebook Pro ALSA: hda: generic: Fix the micmute led init state ALSA: hda/realtek: Apply headset-mic quirks for Xiaomi Redmibook Air ALSA: hda/realtek: fix mute/micmute LEDs for HP 840 G8 ALSA: hda/realtek: fix mute/micmute LEDs for HP 440 G8 ALSA: hda/realtek: fix mute/micmute LEDs for HP 850 G8 Revert "PM: runtime: Update device status before letting suppliers suspend" s390/vtime: fix increased steal time accounting s390/pci: refactor zpci_create_device() s390/pci: remove superfluous zdev->zbus check s390/pci: fix leak of PCI device structure zonefs: Fix O_APPEND async write handling zonefs: prevent use of seq files as swap file zonefs: fix to update .i_wr_refcnt correctly in zonefs_open_zone() btrfs: fix race when cloning extent buffer during rewind of an old root btrfs: fix slab cache flags for free space tree bitmap vhost-vdpa: fix use-after-free of v->config_ctx vhost-vdpa: set v->config_ctx to NULL if eventfd_ctx_fdget() fails drm/amd/display: Correct algorithm for reversed gamma ASoC: fsl_ssi: Fix TDM slot setup for I2S mode ASoC: Intel: bytcr_rt5640: Fix HP Pavilion x2 10-p0XX OVCD current threshold ASoC: SOF: Intel: unregister DMIC device on probe error ASoC: SOF: intel: fix wrong poll bits in dsp power down ASoC: qcom: sdm845: Fix array out of bounds access ASoC: qcom: sdm845: Fix array out of range on rx slim channels ASoC: codecs: wcd934x: add a sanity check in set channel map ASoC: qcom: lpass-cpu: Fix lpass dai ids parse ASoC: simple-card-utils: Do not handle device clock afs: Fix accessing YFS xattrs on a non-YFS server afs: Stop listxattr() from listing "afs." 
attributes ALSA: usb-audio: Fix unintentional sign extension issue nvme: fix Write Zeroes limitations nvme-tcp: fix misuse of __smp_processor_id with preemption enabled nvme-tcp: fix possible hang when failing to set io queues nvme-tcp: fix a NULL deref when receiving a 0-length r2t PDU nvmet: don't check iosqes,iocqes for discovery controllers nfsd: Don't keep looking up unhashed files in the nfsd file cache nfsd: don't abort copies early NFSD: Repair misuse of sv_lock in 5.10.16-rt30. NFSD: fix dest to src mount in inter-server COPY svcrdma: disable timeouts on rdma backchannel vfio: IOMMU_API should be selected vhost_vdpa: fix the missing irq_bypass_unregister_producer() invocation sunrpc: fix refcount leak for rpc auth modules i915/perf: Start hrtimer only if sampling the OA buffer pstore: Fix warning in pstore_kill_sb() io_uring: ensure that SQPOLL thread is started for exit net/qrtr: fix __netdev_alloc_skb call kbuild: Fix <linux/version.h> for empty SUBLEVEL or PATCHLEVEL again cifs: fix allocation size on newly created files riscv: Correct SPARSEMEM configuration scsi: lpfc: Fix some error codes in debugfs scsi: myrs: Fix a double free in myrs_cleanup() scsi: ufs: ufs-mediatek: Correct operator & -> && RISC-V: correct enum sbi_ext_rfence_fid counter: stm32-timer-cnt: Report count function when SLAVE_MODE_DISABLED gpiolib: Assign fwnode to parent's if no primary one provided nvme-rdma: fix possible hang when failing to set io queues ibmvnic: add some debugs ibmvnic: serialize access to work queue on remove tty: serial: stm32-usart: Remove set but unused 'cookie' variables serial: stm32: fix DMA initialization error handling bpf: Declare __bpf_free_used_maps() unconditionally RDMA/rtrs: Remove unnecessary argument dir of rtrs_iu_free RDMA/rtrs-srv: Jump to dereg_mr label if allocate iu fails RDMA/rtrs: Introduce rtrs_post_send RDMA/rtrs: Fix KASAN: stack-out-of-bounds bug module: merge repetitive strings in module_sig_check() module: avoid goto*s in 
module_sig_check() module: harden ELF info handling scsi: pm80xx: Make mpi_build_cmd locking consistent scsi: pm80xx: Make running_req atomic scsi: pm80xx: Fix pm8001_mpi_get_nvmd_resp() race condition scsi: pm8001: Neaten debug logging macros and uses scsi: libsas: Remove notifier indirection scsi: libsas: Introduce a _gfp() variant of event notifiers scsi: mvsas: Pass gfp_t flags to libsas event notifiers scsi: isci: Pass gfp_t flags in isci_port_link_down() scsi: isci: Pass gfp_t flags in isci_port_link_up() scsi: isci: Pass gfp_t flags in isci_port_bc_change_received() RDMA/mlx5: Allow creating all QPs even when non RDMA profile is used powerpc/sstep: Fix load-store and update emulation powerpc/sstep: Fix darn emulation i40e: Fix endianness conversions net: phy: micrel: set soft_reset callback to genphy_soft_reset for KSZ8081 MIPS: compressed: fix build with enabled UBSAN drm/amd/display: turn DPMS off on connector unplug iwlwifi: Add a new card for MA family io_uring: fix inconsistent lock state media: cedrus: h264: Support profile controls ibmvnic: remove excessive irqsave s390/qeth: schedule TX NAPI on QAOB completion drm/amd/pm: fulfill the Polaris implementation for get_clock_by_type_with_latency() io_uring: don't attempt IO reissue from the ring exit path io_uring: clear IOCB_WAITQ for non -EIOCBQUEUED return net: bonding: fix error return code of bond_neigh_init() regulator: pca9450: Add SD_VSEL GPIO for LDO5 regulator: pca9450: Enable system reset on WDOG_B assertion regulator: pca9450: Clear PRESET_EN bit to fix BUCK1/2/3 voltage setting gfs2: Add common helper for holding and releasing the freeze glock gfs2: move freeze glock outside the make_fs_rw and _ro functions gfs2: bypass signal_our_withdraw if no journal powerpc: Force inlining of cpu_has_feature() to avoid build failure usb-storage: Add quirk to defeat Kindle's automatic unload usbip: Fix incorrect double assignment to udc->ud.tcp_rx usb: gadget: configfs: Fix KASAN use-after-free usb: typec: 
Remove vdo[3] part of tps6598x_rx_identity_reg struct usb: typec: tcpm: Invoke power_supply_changed for tcpm-source-psy- usb: dwc3: gadget: Allow runtime suspend if UDC unbinded usb: dwc3: gadget: Prevent EP queuing while stopping transfers thunderbolt: Initialize HopID IDAs in tb_switch_alloc() thunderbolt: Increase runtime PM reference count on DP tunnel discovery iio:adc:stm32-adc: Add HAS_IOMEM dependency iio:adc:qcom-spmi-vadc: add default scale to LR_MUX2_BAT_ID channel iio: adis16400: Fix an error code in adis16400_initial_setup() iio: gyro: mpu3050: Fix error handling in mpu3050_trigger_handler iio: adc: ab8500-gpadc: Fix off by 10 to 3 iio: adc: ad7949: fix wrong ADC result due to incorrect bit mask iio: adc: adi-axi-adc: add proper Kconfig dependencies iio: hid-sensor-humidity: Fix alignment issue of timestamp channel iio: hid-sensor-prox: Fix scale not correct issue iio: hid-sensor-temperature: Fix issues of timestamp channel counter: stm32-timer-cnt: fix ceiling write max value counter: stm32-timer-cnt: fix ceiling miss-alignment with reload register PCI: rpadlpar: Fix potential drc_name corruption in store functions perf/x86/intel: Fix a crash caused by zero PEBS status perf/x86/intel: Fix unchecked MSR access error caused by VLBR_EVENT x86/ioapic: Ignore IRQ2 again kernel, fs: Introduce and use set_restart_fn() and arch_set_restart_data() x86: Move TS_COMPAT back to asm/thread_info.h x86: Introduce TS_COMPAT_RESTART to fix get_nr_restart_syscall() efivars: respect EFI_UNSUPPORTED return from firmware ext4: fix error handling in ext4_end_enable_verity() ext4: find old entry again if failed to rename whiteout ext4: stop inode update before return ext4: do not try to set xattr into ea_inode if value is empty ext4: fix potential error in ext4_do_update_inode ext4: fix rename whiteout with fast commit MAINTAINERS: move some real subsystems off of the staging mailing list MAINTAINERS: move the staging subsystem to lists.linux.dev static_call: Fix 
static_call_update() sanity check efi: use 32-bit alignment for efi_guid_t literals firmware/efi: Fix a use after bug in efi_mem_reserve_persistent genirq: Disable interrupts for force threaded handlers x86/apic/of: Fix CPU devicetree-node lookups cifs: Fix preauth hash corruption Linux 5.10.26 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I6f6bdd1dc46dc744c848e778f9edd0be558b46ac	2021-03-25 17:15:27 +01:00
Vincent Whitchurch	de1126ea44	cifs: Fix preauth hash corruption commit `05946d4b7a` upstream. smb311_update_preauth_hash() uses the shash in server->secmech without appropriate locking, and this can lead to sessions corrupting each other's preauth hashes. The following script can easily trigger the problem: #!/bin/sh -e NMOUNTS=10 for i in $(seq $NMOUNTS); do mkdir -p /tmp/mnt$i umount /tmp/mnt$i 2>/dev/null || : done while :; do for i in $(seq $NMOUNTS); do mount -t cifs //192.168.0.1/test /tmp/mnt$i -o ... & done wait for i in $(seq $NMOUNTS); do umount /tmp/mnt$i done done Usually within seconds this leads to one or more of the mounts failing with the following errors, and a "Bad SMB2 signature for message" is seen in the server logs: CIFS: VFS: \\192.168.0.1 failed to connect to IPC (rc=-13) CIFS: VFS: cifs_mount failed w/return code = -13 Fix it by holding the server mutex just like in the other places where the shashes are used. Fixes: `8bd68c6e47` ("CIFS: implement v3.11 preauth integrity") Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> CC: <stable@vger.kernel.org> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com> [aaptel: backport to kernel without CIFS_SESS_OP] Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-25 09:04:18 +01:00
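The corruption described in the cifs commit above can be modelled deterministically in user space. This is a minimal sketch, not the kernel code: `SharedHasher` is a hypothetical stand-in for the shash kept in server->secmech, its `lock` plays the role of the server mutex, and the bad interleaving is scripted by hand instead of being produced by real concurrent mounts.

```python
import hashlib
import threading


class SharedHasher:
    """Hypothetical stand-in for one hash context shared by all sessions."""

    def __init__(self):
        self._ctx = None
        self.lock = threading.Lock()  # plays the role of the server mutex

    def init(self):
        # SMB 3.1.1 preauth integrity uses SHA-512.
        self._ctx = hashlib.sha512()

    def update(self, data):
        self._ctx.update(data)

    def final(self):
        return self._ctx.digest()


def preauth_hash_locked(hasher, prev_hash, message):
    """Fixed pattern: the whole init/update/final sequence runs under the lock."""
    with hasher.lock:
        hasher.init()
        hasher.update(prev_hash)
        hasher.update(message)
        return hasher.final()


def interleaved_without_lock(hasher, session_a, session_b):
    """Scripted race: session B re-initialises the shared context while
    session A is in the middle of its update sequence."""
    prev_a, msg_a = session_a
    prev_b, _msg_b = session_b
    hasher.init()           # A starts
    hasher.update(prev_a)   # A feeds its previous hash
    hasher.init()           # B starts and wipes A's state
    hasher.update(prev_b)   # B feeds its previous hash
    hasher.update(msg_a)    # A continues on B's state
    return hasher.final()   # A reads a digest that mixes both sessions
```

With two sessions whose previous hashes differ, `interleaved_without_lock()` returns a digest that no longer matches SHA-512 over A's own data, while `preauth_hash_locked()` always does.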
Harshad Shirwadkar	35ecf664fd	ext4: fix rename whiteout with fast commit commit `8210bb29c1` upstream. This patch adds rename whiteout support in fast commits. Note that the whiteout object that gets created is actually a char device, which implies that the function ext4_inode_journal_mode(struct inode *inode) would return "JOURNAL_DATA" for this inode. As a consequence, the fast commit code treats creation of the whiteout object as a fast-commit-ineligible operation and falls back to full commits. With this patch, this can be observed by running fast commits with rename whiteout and seeing the stats generated by ext4_fc_stats tracepoint as follows: ext4_fc_stats: dev 254:32 fc ineligible reasons: XATTR:0, CROSS_RENAME:0, JOURNAL_FLAG_CHANGE:0, NO_MEM:0, SWAP_BOOT:0, RESIZE:0, RENAME_DIR:0, FALLOC_RANGE:0, INODE_JOURNAL_DATA:16; num_commits:6, ineligible: 6, numblks: 3 So in short, this patch guarantees that in case of rename whiteout, we fall back to full commits. Amir mentioned that instead of creating a new whiteout object for every rename, we can create a static whiteout object with irrelevant nlink. That would let fast commits avoid falling back to full commits. But until this happens, this patch will ensure correctness by falling back to full commits. Fixes: `8016e29f43` ("ext4: fast commit recovery path") Cc: stable@kernel.org Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20210316221921.1124955-1-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-25 09:04:17 +01:00
Shijie Luo	e8fa569465	ext4: fix potential error in ext4_do_update_inode commit `7d8bd3c76d` upstream. If set_large_file = 1 and errors occur in ext4_handle_dirty_metadata(), the error code will be overridden. Go to out_brelse to avoid this situation. Signed-off-by: Shijie Luo <luoshijie1@huawei.com> Link: https://lore.kernel.org/r/20210312065051.36314-1-luoshijie1@huawei.com Cc: stable@kernel.org Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-25 09:04:17 +01:00
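The overridden-error pattern from the ext4_do_update_inode commit above can be illustrated with a small user-space sketch. The function and parameter names below are hypothetical: `dirty_metadata` stands in for ext4_handle_dirty_metadata(), `later_step` stands in for whatever work runs only when set_large_file is set, and the early return corresponds to the jump to out_brelse.

```python
import errno


def update_inode_buggy(dirty_metadata, later_step, set_large_file):
    """Buggy shape: a later assignment silently replaces an earlier error."""
    err = dirty_metadata()
    if set_large_file:
        err = later_step()  # overwrites the error from dirty_metadata()
    return err


def update_inode_fixed(dirty_metadata, later_step, set_large_file):
    """Fixed shape: bail out as soon as the first step fails."""
    err = dirty_metadata()
    if err:
        return err  # models "goto out_brelse"
    if set_large_file:
        err = later_step()
    return err


def failing_dirty_metadata():
    return -errno.EIO


def succeeding_later_step():
    return 0
```

Calling `update_inode_buggy(failing_dirty_metadata, succeeding_later_step, True)` reports success even though the first step failed, whereas `update_inode_fixed()` with the same arguments propagates `-EIO`.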
zhangyi (F)	6163a0662b	ext4: do not try to set xattr into ea_inode if value is empty commit `6b22489911` upstream. Syzbot reported a warning that ext4 may create an empty ea_inode when an empty extended attribute is set on a file in a file system that has no free blocks left. WARNING: CPU: 6 PID: 10667 at fs/ext4/xattr.c:1640 ext4_xattr_set_entry+0x10f8/0x1114 fs/ext4/xattr.c:1640 ... Call trace: ext4_xattr_set_entry+0x10f8/0x1114 fs/ext4/xattr.c:1640 ext4_xattr_block_set+0x1d0/0x1b1c fs/ext4/xattr.c:1942 ext4_xattr_set_handle+0x8a0/0xf1c fs/ext4/xattr.c:2390 ext4_xattr_set+0x120/0x1f0 fs/ext4/xattr.c:2491 ext4_xattr_trusted_set+0x48/0x5c fs/ext4/xattr_trusted.c:37 __vfs_setxattr+0x208/0x23c fs/xattr.c:177 ... Currently, ext4 tries to store the extended attribute in an external inode if ext4_xattr_block_set() returns -ENOSPC, but when storing an empty extended attribute, storing the entry in the extended attribute block is enough. A simple reproducer follows. fallocate test.img -l 1M mkfs.ext4 -F -b 2048 -O ea_inode test.img mount test.img /mnt dd if=/dev/zero of=/mnt/foo bs=2048 count=500 setfattr -n "user.test" /mnt/foo Reported-by: syzbot+98b881fdd8ebf45ab4ae@syzkaller.appspotmail.com Fixes: `9c6e7853c5` ("ext4: reserve space for xattr entries/names") Cc: stable@kernel.org Signed-off-by: zhangyi (F) <yi.zhang@huawei.com> Link: https://lore.kernel.org/r/20210305120508.298465-1-yi.zhang@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-25 09:04:17 +01:00
Pan Bian	d130b802f9	ext4: stop inode update before return commit `512c15ef05` upstream. The inode update should be stopped before returning the error code. Signed-off-by: Pan Bian <bianpan2016@163.com> Link: https://lore.kernel.org/r/20210117085732.93788-1-bianpan2016@163.com Fixes: `8016e29f43` ("ext4: fast commit recovery path") Cc: stable@kernel.org Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-25 09:04:17 +01:00
zhangyi (F)	258db8e6ff	ext4: find old entry again if failed to rename whiteout commit `b7ff91fd03` upstream. If we failed to add new entry on rename whiteout, we cannot reset the old->de entry directly, because the old->de could have moved from under us during make indexed dir. So find the old entry again before reset is needed, otherwise it may corrupt the filesystem as below. /dev/sda: Entry '00000001' in ??? (12) has deleted/unused inode 15. CLEARED. /dev/sda: Unattached inode 75 /dev/sda: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY. Fixes: `6b4b8e6b4a` ("ext4: fix bug for rename with RENAME_WHITEOUT") Cc: stable@vger.kernel.org Signed-off-by: zhangyi (F) <yi.zhang@huawei.com> Link: https://lore.kernel.org/r/20210303131703.330415-1-yi.zhang@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-25 09:04:17 +01:00
Eric Biggers	9689ecadf8	ext4: fix error handling in ext4_end_enable_verity() commit `f053cf7aa6` upstream. ext4 didn't properly clean up if verity failed to be enabled on a file: - It left verity metadata (pages past EOF) in the page cache, which would be exposed to userspace if the file was later extended. - It didn't truncate the verity metadata at all (either from cache or from disk) if an error occurred while setting the verity bit. Fix these bugs by adding a call to truncate_inode_pages() and ensuring that we truncate the verity metadata (both from cache and from disk) in all error paths. Also rework the code to cleanly separate the success path from the error paths, which makes it much easier to understand. Reported-by: Yunlei He <heyunlei@hihonor.com> Fixes: `c93d8f8858` ("ext4: add basic fs-verity support") Cc: stable@vger.kernel.org # v5.4+ Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20210302200420.137977-2-ebiggers@kernel.org Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-25 09:04:17 +01:00
Oleg Nesterov	4523e648b7	kernel, fs: Introduce and use set_restart_fn() and arch_set_restart_data() commit `5abbe51a52` upstream. Preparation for fixing get_nr_restart_syscall() on X86 for COMPAT. Add a new helper which sets restart_block->fn and calls a dummy arch_set_restart_data() helper. Fixes: `609c19a385` ("x86/ptrace: Stop setting TS_COMPAT in ptrace code") Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210201174641.GA17871@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-25 09:04:16 +01:00
Bob Peterson	2bdef2b476	gfs2: bypass signal_our_withdraw if no journal [ Upstream commit `d5bf630f35` ] Before this patch, function signal_our_withdraw referenced the journal inode immediately. But corrupt file systems may have some invalid journals, in which case our attempt to read it in will withdraw and the resulting signal_our_withdraw would dereference the NULL value. This patch adds a check to signal_our_withdraw so that if the journal has not yet been initialized, it simply returns and does the old-style withdraw. Thanks, Andy Price, for his analysis. Reported-by: syzbot+50a8a9cf8127f2c6f5df@syzkaller.appspotmail.com Fixes: `601ef0d52e` ("gfs2: Force withdraw to replay journals and wait for it to finish") Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-25 09:04:14 +01:00
Bob Peterson	a602e830dd	gfs2: move freeze glock outside the make_fs_rw and _ro functions [ Upstream commit `96b1454f2e` ] Before this patch, sister functions gfs2_make_fs_rw and gfs2_make_fs_ro locked (held) the freeze glock by calling gfs2_freeze_lock and gfs2_freeze_unlock. The problem is, not all the callers of gfs2_make_fs_ro should be doing this. The three callers of gfs2_make_fs_ro are: remount (gfs2_reconfigure), signal_our_withdraw, and unmount (gfs2_put_super). But when unmounting the file system we can get into the following circular lock dependency: deactivate_super down_write(&s->s_umount); <-------------------------------------- s_umount deactivate_locked_super gfs2_kill_sb kill_block_super generic_shutdown_super gfs2_put_super gfs2_make_fs_ro gfs2_glock_nq_init sd_freeze_gl freeze_go_sync if (freeze glock in SH) freeze_super (vfs) down_write(&sb->s_umount); <------- s_umount This patch moves the hold of the freeze glock outside the two sister rw/ro functions to their callers, but it doesn't request the glock from gfs2_put_super, thus eliminating the circular dependency. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-25 09:04:14 +01:00
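The circular dependency in the gfs2 unmount path above comes down to one non-recursive lock being requested twice on the same call chain. The sketch below is a simplified user-space model under stated assumptions: `s_umount` is modelled as a plain `threading.Lock` (in the kernel it is a read-write semaphore), the function names only mirror the call chain from the commit message, and a non-blocking acquire is used so the example terminates instead of hanging.

```python
import threading

# Models sb->s_umount as a simple non-recursive lock (an assumption;
# the kernel uses a read-write semaphore here).
s_umount = threading.Lock()


def freeze_super():
    """Models the VFS freeze_super(), which needs s_umount for itself."""
    if not s_umount.acquire(blocking=False):
        # A blocking acquire here would never return: the caller holds it.
        return "would deadlock"
    s_umount.release()
    return "ok"


def make_fs_ro(take_freeze_glock):
    """Models gfs2_make_fs_ro(); before the patch it always took the freeze
    glock, which could end up in freeze_super() via freeze_go_sync."""
    if take_freeze_glock:
        return freeze_super()
    return "ok"


def put_super(take_freeze_glock):
    """Models the unmount path: deactivate_super() already holds s_umount."""
    with s_umount:
        return make_fs_ro(take_freeze_glock)
```

`put_super(True)` models the pre-patch behaviour and reports the self-deadlock, while `put_super(False)` models the patched unmount path, which no longer requests the freeze glock.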
Bob Peterson	49787b1bba	gfs2: Add common helper for holding and releasing the freeze glock [ Upstream commit `c77b52c0a1` ] Many places in the gfs2 code queued and dequeued the freeze glock. Almost all of them acquire it in SHARED mode, and need to specify the same LM_FLAG_NOEXP and GL_EXACT flags. This patch adds common helper functions gfs2_freeze_lock and gfs2_freeze_unlock to make the code more readable, and to prepare for the next patch. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-25 09:04:14 +01:00
Jens Axboe	76f496681d	io_uring: clear IOCB_WAITQ for non -EIOCBQUEUED return [ Upstream commit `b5b0ecb736` ] The callback can only be armed, if we get -EIOCBQUEUED returned. It's important that we clear the WAITQ bit for other cases, otherwise we can queue for async retry and filemap will assume that we're armed and return -EAGAIN instead of just blocking for the IO. Cc: stable@vger.kernel.org # 5.9+ Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-25 09:04:13 +01:00
Jens Axboe	3c08f772ad	io_uring: don't attempt IO reissue from the ring exit path [ Upstream commit `7c977a58dc` ] If we're exiting the ring, just let the IO fail with -EAGAIN as nobody will care anyway. It's not the right context to reissue from. Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-25 09:04:13 +01:00
Pavel Begunkov	1c20e9040f	io_uring: fix inconsistent lock state [ Upstream commit `9ae1f8dd37` ] WARNING: inconsistent lock state inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage. syz-executor217/8450 [HC1[1]:SC0[0]:HE0:SE1] takes: ffff888023d6e620 (&fs->lock){?.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:354 [inline] ffff888023d6e620 (&fs->lock){?.+.}-{2:2}, at: io_req_clean_work fs/io_uring.c:1398 [inline] ffff888023d6e620 (&fs->lock){?.+.}-{2:2}, at: io_dismantle_req+0x66f/0xf60 fs/io_uring.c:2029 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&fs->lock); <Interrupt> lock(&fs->lock); * DEADLOCK * 1 lock held by syz-executor217/8450: #0: ffff88802417c3e8 (&ctx->uring_lock){+.+.}-{3:3}, at: __do_sys_io_uring_enter+0x1071/0x1f30 fs/io_uring.c:9442 stack backtrace: CPU: 1 PID: 8450 Comm: syz-executor217 Not tainted 5.11.0-rc5-next-20210129-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <IRQ> [...] 
_raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151 spin_lock include/linux/spinlock.h:354 [inline] io_req_clean_work fs/io_uring.c:1398 [inline] io_dismantle_req+0x66f/0xf60 fs/io_uring.c:2029 __io_free_req+0x3d/0x2e0 fs/io_uring.c:2046 io_free_req fs/io_uring.c:2269 [inline] io_double_put_req fs/io_uring.c:2392 [inline] io_put_req+0xf9/0x570 fs/io_uring.c:2388 io_link_timeout_fn+0x30c/0x480 fs/io_uring.c:6497 __run_hrtimer kernel/time/hrtimer.c:1519 [inline] __hrtimer_run_queues+0x609/0xe40 kernel/time/hrtimer.c:1583 hrtimer_interrupt+0x334/0x940 kernel/time/hrtimer.c:1645 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1085 [inline] __sysvec_apic_timer_interrupt+0x146/0x540 arch/x86/kernel/apic/apic.c:1102 asm_call_irq_on_stack+0xf/0x20 </IRQ> __run_sysvec_on_irqstack arch/x86/include/asm/irq_stack.h:37 [inline] run_sysvec_on_irqstack_cond arch/x86/include/asm/irq_stack.h:89 [inline] sysvec_apic_timer_interrupt+0xbd/0x100 arch/x86/kernel/apic/apic.c:1096 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:629 RIP: 0010:__raw_spin_unlock_irq include/linux/spinlock_api_smp.h:169 [inline] RIP: 0010:_raw_spin_unlock_irq+0x25/0x40 kernel/locking/spinlock.c:199 spin_unlock_irq include/linux/spinlock.h:404 [inline] io_queue_linked_timeout+0x194/0x1f0 fs/io_uring.c:6525 __io_queue_sqe+0x328/0x1290 fs/io_uring.c:6594 io_queue_sqe+0x631/0x10d0 fs/io_uring.c:6639 io_queue_link_head fs/io_uring.c:6650 [inline] io_submit_sqe fs/io_uring.c:6697 [inline] io_submit_sqes+0x19b5/0x2720 fs/io_uring.c:6960 __do_sys_io_uring_enter+0x107d/0x1f30 fs/io_uring.c:9443 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Don't free requests from under hrtimer context (softirq) as it may sleep or take spinlocks improperly (e.g. non-irq versions). 
Cc: stable@vger.kernel.org # 5.6+ Reported-by: syzbot+81d17233a2b02eafba33@syzkaller.appspotmail.com Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-25 09:04:13 +01:00
Steve French	04eb2b2fa1	cifs: fix allocation size on newly created files commit `65af8f0166` upstream. Applications that create and extend and write to a file do not expect to see 0 allocation size. When file is extended, set its allocation size to a plausible value until we have a chance to query the server for it. When the file is cached this will prevent showing an impossible number of allocated blocks (like 0). This fixes e.g. xfstests 614 which does 1) create a file and set its size to 64K 2) mmap write 64K to the file 3) stat -c %b for the file (to query the number of allocated blocks) It was failing because we returned 0 blocks. Even though we would return the correct cached file size, we returned an impossible allocation size. Signed-off-by: Steve French <stfrench@microsoft.com> CC: <stable@vger.kernel.org> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-25 09:04:09 +01:00
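A short sketch of what a "plausible" allocation size could look like for the xfstests case in the cifs commit above. It assumes that blocks are counted in 512-byte units, as `stat -c %b` commonly reports them, and that rounding the file size up to whole blocks is an acceptable placeholder until the server is queried; the helper name is hypothetical and this is not a copy of the kernel change.

```python
BLOCK_UNIT = 512  # assumed unit for the block count reported by stat


def plausible_block_count(size_bytes):
    """Round a file size up to a whole number of 512-byte blocks."""
    if size_bytes < 0:
        raise ValueError("size must be non-negative")
    return (size_bytes + BLOCK_UNIT - 1) // BLOCK_UNIT
```

For the 64K file from the test description this gives 128 blocks instead of the impossible 0.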
Jens Axboe	6cae809549	io_uring: ensure that SQPOLL thread is started for exit commit `3ebba796fa` upstream. If we create it in a disabled state because IORING_SETUP_R_DISABLED is set on ring creation, we need to ensure that we've kicked the thread if we're exiting before it's been explicitly disabled. Otherwise we can run into a deadlock where exit is waiting to park the SQPOLL thread, but the SQPOLL thread itself is waiting to get a signal to start. That results in the below trace of both tasks hung, waiting on each other: INFO: task syz-executor458:8401 blocked for more than 143 seconds. Not tainted 5.11.0-next-20210226-syzkaller #0 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:syz-executor458 state:D stack:27536 pid: 8401 ppid: 8400 flags:0x00004004 Call Trace: context_switch kernel/sched/core.c:4324 [inline] __schedule+0x90c/0x21a0 kernel/sched/core.c:5075 schedule+0xcf/0x270 kernel/sched/core.c:5154 schedule_timeout+0x1db/0x250 kernel/time/timer.c:1868 do_wait_for_common kernel/sched/completion.c:85 [inline] __wait_for_common kernel/sched/completion.c:106 [inline] wait_for_common kernel/sched/completion.c:117 [inline] wait_for_completion+0x168/0x270 kernel/sched/completion.c:138 io_sq_thread_park fs/io_uring.c:7115 [inline] io_sq_thread_park+0xd5/0x130 fs/io_uring.c:7103 io_uring_cancel_task_requests+0x24c/0xd90 fs/io_uring.c:8745 __io_uring_files_cancel+0x110/0x230 fs/io_uring.c:8840 io_uring_files_cancel include/linux/io_uring.h:47 [inline] do_exit+0x299/0x2a60 kernel/exit.c:780 do_group_exit+0x125/0x310 kernel/exit.c:922 __do_sys_exit_group kernel/exit.c:933 [inline] __se_sys_exit_group kernel/exit.c:931 [inline] __x64_sys_exit_group+0x3a/0x50 kernel/exit.c:931 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x43e899 RSP: 002b:00007ffe89376d48 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7 RAX: ffffffffffffffda RBX: 00000000004af2f0 RCX: 000000000043e899 RDX: 
000000000000003c RSI: 00000000000000e7 RDI: 0000000000000000 RBP: 0000000000000000 R08: ffffffffffffffc0 R09: 0000000010000000 R10: 0000000000008011 R11: 0000000000000246 R12: 00000000004af2f0 R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000001 INFO: task iou-sqp-8401:8402 can't die for more than 143 seconds. task:iou-sqp-8401 state:D stack:30272 pid: 8402 ppid: 8400 flags:0x00004004 Call Trace: context_switch kernel/sched/core.c:4324 [inline] __schedule+0x90c/0x21a0 kernel/sched/core.c:5075 schedule+0xcf/0x270 kernel/sched/core.c:5154 schedule_timeout+0x1db/0x250 kernel/time/timer.c:1868 do_wait_for_common kernel/sched/completion.c:85 [inline] __wait_for_common kernel/sched/completion.c:106 [inline] wait_for_common kernel/sched/completion.c:117 [inline] wait_for_completion+0x168/0x270 kernel/sched/completion.c:138 io_sq_thread+0x27d/0x1ae0 fs/io_uring.c:6717 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294 INFO: task iou-sqp-8401:8402 blocked for more than 143 seconds. Reported-by: syzbot+fb5458330b4442f2090d@syzkaller.appspotmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-25 09:04:09 +01:00
Tetsuo Handa	a7acb61428	pstore: Fix warning in pstore_kill_sb() commit `9c7d83ae6b` upstream. syzbot is hitting WARN_ON(pstore_sb != sb) at pstore_kill_sb() [1], because the assumption that pstore_sb != NULL is wrong: pstore_fill_super() will not assign pstore_sb = sb when new_inode() for d_make_root() returns NULL (due to memory allocation fault injection). Since mount_single() calls pstore_kill_sb() when pstore_fill_super() fails, pstore_kill_sb() needs to be aware of this failure path. [1] https://syzkaller.appspot.com/bug?id=6abacb8da5137cb47a416f2bef95719ed60508a0 Reported-by: syzbot <syzbot+d0cf0ad6513e9a1da5df@syzkaller.appspotmail.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210214031307.57903-1-penguin-kernel@I-love.SAKURA.ne.jp Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-25 09:04:08 +01:00
Olga Kornievskaia	982b899ba6	NFSD: fix dest to src mount in inter-server COPY commit `614c975017` upstream. A cleanup of the inter SSC copy needs to call fput() of the source file handle to make sure that file structure is freed as well as drop the reference on the superblock to unmount the source server. Fixes: `36e1e5ba90` ("NFSD: Fix use-after-free warning when doing inter-server copy") Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-25 09:04:08 +01:00
J. Bruce Fields	12628e7779	nfsd: don't abort copies early commit `bfdd89f232` upstream. The typical result of the backwards comparison here is that the source server in a server-to-server copy will return BAD_STATEID within a few seconds of the copy starting, instead of giving the copy a full lease period, so the copy_file_range() call will end up unnecessarily returning a short read. Fixes: `624322f1ad` "NFSD add COPY_NOTIFY operation" Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-25 09:04:08 +01:00
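The effect of a backwards expiry comparison, as described in the nfsd commit above, can be shown with a tiny sketch. The names, the 90-second lease, and the timestamps are illustrative assumptions and do not reproduce the kernel's actual variables or check.

```python
LEASE_SECONDS = 90  # assumed lease period, for illustration only


def expired_backwards(now, granted_at, lease=LEASE_SECONDS):
    """Inverted check: treats a state as expired while its lease is still running."""
    return granted_at + lease > now


def expired_correct(now, granted_at, lease=LEASE_SECONDS):
    """Intended check: a state expires only after the full lease has elapsed."""
    return now > granted_at + lease
```

Five seconds into a copy that started at `granted_at = 1000`, the backwards version already reports expiry, which corresponds to the early BAD_STATEID, while the correct version keeps the state valid until the lease period has passed.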
Trond Myklebust	5ea0aa29ad	nfsd: Don't keep looking up unhashed files in the nfsd file cache commit `d30881f573` upstream. If a file is unhashed, then we're going to reject it anyway and retry, so make sure we skip it when we're doing the RCU lockless lookup. This avoids a number of unnecessary nfserr_jukebox returns from nfsd_file_acquire() Fixes: `65294c1f2c` ("nfsd: add a new struct file caching facility to nfsd") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-25 09:04:08 +01:00
David Howells	64195f022a	afs: Stop listxattr() from listing "afs." attributes commit `a7889c6320` upstream. afs_listxattr() lists all the available special afs xattrs (i.e. those in the "afs." space), no matter what type of server we're dealing with. But OpenAFS servers, for example, cannot deal with some of the extra-capable attributes that AuriStor (YFS) servers provide. Unfortunately, the presence of the afs.yfs.* attributes causes errors[1] for anything that tries to read them if the server is of the wrong type. Fix the problem by removing afs_listxattr() so that none of the special xattrs are listed (AFS doesn't support xattrs). It does mean, however, that getfattr won't list them, though they can still be accessed with getxattr() and setxattr(). This can be tested with something like: getfattr -d -m "." /afs/example.com/path/to/file With this change, none of the afs. attributes should be visible. Changes: ver #2: - Hide all of the afs.* xattrs, not just the ACL ones. Fixes: `ae46578b96` ("afs: Get YFS ACLs and information through xattrs") Reported-by: Gaja Sophie Peters <gaja.peters@math.uni-hamburg.de> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Gaja Sophie Peters <gaja.peters@math.uni-hamburg.de> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003502.html [1] Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003567.html # v1 Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003573.html # v2 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-25 09:04:07 +01:00
David Howells	78ba4793b0	afs: Fix accessing YFS xattrs on a non-YFS server commit `64fcbb6158` upstream. If someone attempts to access YFS-related xattrs (e.g. afs.yfs.acl) on a file on a non-YFS AFS server (such as OpenAFS), then the kernel will jump to a NULL function pointer because the afs_fetch_acl_operation descriptor doesn't point to a function for issuing an operation on a non-YFS server[1]. Fix this by making afs_wait_for_operation() check that the issue_afs_rpc method is set before jumping to it and setting -ENOTSUPP if not. This fix also covers other potential operations that also only exist on YFS servers. afs_xattr_get/set_yfs() then need to translate -ENOTSUPP to -ENODATA as the former error is internal to the kernel. The bug shows up as an oops like the following: BUG: kernel NULL pointer dereference, address: 0000000000000000 [...] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. [...] Call Trace: afs_wait_for_operation+0x83/0x1b0 [kafs] afs_xattr_get_yfs+0xe6/0x270 [kafs] __vfs_getxattr+0x59/0x80 vfs_getxattr+0x11c/0x140 getxattr+0x181/0x250 ? __check_object_size+0x13f/0x150 ? 
__fput+0x16d/0x250 __x64_sys_fgetxattr+0x64/0xb0 do_syscall_64+0x49/0xc0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7fb120a9defe This was triggered with "cp -a" which attempts to copy xattrs, including afs ones, but is easier to reproduce with getfattr, e.g.: getfattr -d -m ".*" /afs/openafs.org/ Fixes: `e49c7b2f6d` ("afs: Build an abstraction around an "operation" concept") Reported-by: Gaja Sophie Peters <gaja.peters@math.uni-hamburg.de> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Gaja Sophie Peters <gaja.peters@math.uni-hamburg.de> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> cc: linux-afs@lists.infradead.org Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003498.html [1] Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003566.html # v1 Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003572.html # v2 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-25 09:04:07 +01:00
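The two-part fix in the afs commit above (refuse to call a missing method, then translate the kernel-internal error for user space) can be sketched as follows. This is a user-space model with hypothetical class and function names; the numeric values are the Linux errno values, with ENOTSUPP being a kernel-internal code that Python's `errno` module does not define.

```python
ENOTSUPP = 524  # kernel-internal "operation is not supported"
ENODATA = 61    # Linux value of ENODATA


class Operation:
    """Hypothetical operation descriptor with optional per-server-type methods."""

    def __init__(self, issue_afs_rpc=None, issue_yfs_rpc=None):
        self.issue_afs_rpc = issue_afs_rpc
        self.issue_yfs_rpc = issue_yfs_rpc


def wait_for_operation(op, server_is_yfs):
    """Check that the method exists before calling it, instead of jumping
    through an unset pointer."""
    issue = op.issue_yfs_rpc if server_is_yfs else op.issue_afs_rpc
    if issue is None:
        return -ENOTSUPP
    return issue()


def xattr_get_yfs(op, server_is_yfs):
    """Translate the internal -ENOTSUPP into -ENODATA for the caller."""
    ret = wait_for_operation(op, server_is_yfs)
    if ret == -ENOTSUPP:
        ret = -ENODATA
    return ret
```

An operation that only provides a YFS method succeeds against a YFS server and yields `-ENODATA` against a non-YFS server, rather than crashing.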
David Sterba	2c8d6a9474	btrfs: fix slab cache flags for free space tree bitmap commit `34e49994d0` upstream. The free space tree bitmap slab cache is created with SLAB_RED_ZONE but that's a debugging flag and not always enabled. Also the other slabs are created with at least SLAB_MEM_SPREAD that we want as well to average the memory placement cost. Reported-by: Vlastimil Babka <vbabka@suse.cz> Fixes: `3acd48507d` ("btrfs: fix allocation of free space cache v1 bitmap pages") CC: stable@vger.kernel.org # 5.4+ Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-25 09:04:06 +01:00
Filipe Manana	38ffe9eaeb	btrfs: fix race when cloning extent buffer during rewind of an old root commit `dbcc7d57bf` upstream. While resolving backreferences, as part of a logical ino ioctl call or fiemap, we can end up hitting a BUG_ON() when replaying tree mod log operations of a root, triggering a stack trace like the following: ------------[ cut here ]------------ kernel BUG at fs/btrfs/ctree.c:1210! invalid opcode: 0000 [#1] SMP KASAN PTI CPU: 1 PID: 19054 Comm: crawl_335 Tainted: G W 5.11.0-2d11c0084b02-misc-next+ #89 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 RIP: 0010:__tree_mod_log_rewind+0x3b1/0x3c0 Code: 05 48 8d 74 10 (...) RSP: 0018:ffffc90001eb70b8 EFLAGS: 00010297 RAX: 0000000000000000 RBX: ffff88812344e400 RCX: ffffffffb28933b6 RDX: 0000000000000007 RSI: dffffc0000000000 RDI: ffff88812344e42c RBP: ffffc90001eb7108 R08: 1ffff11020b60a20 R09: ffffed1020b60a20 R10: ffff888105b050f9 R11: ffffed1020b60a1f R12: 00000000000000ee R13: ffff8880195520c0 R14: ffff8881bc958500 R15: ffff88812344e42c FS: 00007fd1955e8700(0000) GS:ffff8881f5600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007efdb7928718 CR3: 000000010103a006 CR4: 0000000000170ee0 Call Trace: btrfs_search_old_slot+0x265/0x10d0 ? lock_acquired+0xbb/0x600 ? btrfs_search_slot+0x1090/0x1090 ? free_extent_buffer.part.61+0xd7/0x140 ? free_extent_buffer+0x13/0x20 resolve_indirect_refs+0x3e9/0xfc0 ? lock_downgrade+0x3d0/0x3d0 ? __kasan_check_read+0x11/0x20 ? add_prelim_ref.part.11+0x150/0x150 ? lock_downgrade+0x3d0/0x3d0 ? __kasan_check_read+0x11/0x20 ? lock_acquired+0xbb/0x600 ? __kasan_check_write+0x14/0x20 ? do_raw_spin_unlock+0xa8/0x140 ? rb_insert_color+0x30/0x360 ? prelim_ref_insert+0x12d/0x430 find_parent_nodes+0x5c3/0x1830 ? resolve_indirect_refs+0xfc0/0xfc0 ? lock_release+0xc8/0x620 ? fs_reclaim_acquire+0x67/0xf0 ? lock_acquire+0xc7/0x510 ? lock_downgrade+0x3d0/0x3d0 ? lockdep_hardirqs_on_prepare+0x160/0x210 ? 
lock_release+0xc8/0x620 ? fs_reclaim_acquire+0x67/0xf0 ? lock_acquire+0xc7/0x510 ? poison_range+0x38/0x40 ? unpoison_range+0x14/0x40 ? trace_hardirqs_on+0x55/0x120 btrfs_find_all_roots_safe+0x142/0x1e0 ? find_parent_nodes+0x1830/0x1830 ? btrfs_inode_flags_to_xflags+0x50/0x50 iterate_extent_inodes+0x20e/0x580 ? tree_backref_for_extent+0x230/0x230 ? lock_downgrade+0x3d0/0x3d0 ? read_extent_buffer+0xdd/0x110 ? lock_downgrade+0x3d0/0x3d0 ? __kasan_check_read+0x11/0x20 ? lock_acquired+0xbb/0x600 ? __kasan_check_write+0x14/0x20 ? _raw_spin_unlock+0x22/0x30 ? __kasan_check_write+0x14/0x20 iterate_inodes_from_logical+0x129/0x170 ? iterate_inodes_from_logical+0x129/0x170 ? btrfs_inode_flags_to_xflags+0x50/0x50 ? iterate_extent_inodes+0x580/0x580 ? __vmalloc_node+0x92/0xb0 ? init_data_container+0x34/0xb0 ? init_data_container+0x34/0xb0 ? kvmalloc_node+0x60/0x80 btrfs_ioctl_logical_to_ino+0x158/0x230 btrfs_ioctl+0x205e/0x4040 ? __might_sleep+0x71/0xe0 ? btrfs_ioctl_get_supported_features+0x30/0x30 ? getrusage+0x4b6/0x9c0 ? __kasan_check_read+0x11/0x20 ? lock_release+0xc8/0x620 ? __might_fault+0x64/0xd0 ? lock_acquire+0xc7/0x510 ? lock_downgrade+0x3d0/0x3d0 ? lockdep_hardirqs_on_prepare+0x210/0x210 ? lockdep_hardirqs_on_prepare+0x210/0x210 ? __kasan_check_read+0x11/0x20 ? do_vfs_ioctl+0xfc/0x9d0 ? ioctl_file_clone+0xe0/0xe0 ? lock_downgrade+0x3d0/0x3d0 ? lockdep_hardirqs_on_prepare+0x210/0x210 ? __kasan_check_read+0x11/0x20 ? lock_release+0xc8/0x620 ? __task_pid_nr_ns+0xd3/0x250 ? lock_acquire+0xc7/0x510 ? __fget_files+0x160/0x230 ? __fget_light+0xf2/0x110 __x64_sys_ioctl+0xc3/0x100 do_syscall_64+0x37/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7fd1976e2427 Code: 00 00 90 48 8b 05 (...) 
RSP: 002b:00007fd1955e5cf8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007fd1955e5f40 RCX: 00007fd1976e2427 RDX: 00007fd1955e5f48 RSI: 00000000c038943b RDI: 0000000000000004 RBP: 0000000001000000 R08: 0000000000000000 R09: 00007fd1955e6120 R10: 0000557835366b00 R11: 0000000000000246 R12: 0000000000000004 R13: 00007fd1955e5f48 R14: 00007fd1955e5f40 R15: 00007fd1955e5ef8 Modules linked in: ---[ end trace ec8931a1c36e57be ]--- (gdb) l (__tree_mod_log_rewind+0x3b1) 0xffffffff81893521 is in __tree_mod_log_rewind (fs/btrfs/ctree.c:1210). 1205 the modification. as we're going backwards, we do the 1206 * opposite of each operation here. 1207 */ 1208 switch (tm->op) { 1209 case MOD_LOG_KEY_REMOVE_WHILE_FREEING: 1210 BUG_ON(tm->slot < n); 1211 fallthrough; 1212 case MOD_LOG_KEY_REMOVE_WHILE_MOVING: 1213 case MOD_LOG_KEY_REMOVE: 1214 btrfs_set_node_key(eb, &tm->key, tm->slot); Here's what happens to hit that BUG_ON(): 1) We have one tree mod log user (through fiemap or the logical ino ioctl), with a sequence number of 1, so we have fs_info->tree_mod_seq == 1; 2) Another task is at ctree.c:balance_level() and we have eb X currently as the root of the tree, and we promote its single child, eb Y, as the new root. Then, at ctree.c:balance_level(), we call: tree_mod_log_insert_root(eb X, eb Y, 1); 3) At tree_mod_log_insert_root() we create tree mod log elements for each slot of eb X, of operation type MOD_LOG_KEY_REMOVE_WHILE_FREEING each with a ->logical pointing to ebX->start. These are placed in an array named tm_list. Let's assume there are N elements (N pointers in eb X); 4) Then, still at tree_mod_log_insert_root(), we create a tree mod log element of operation type MOD_LOG_ROOT_REPLACE, ->logical set to ebY->start, ->old_root.logical set to ebX->start, ->old_root.level set to the level of eb X and ->generation set to the generation of eb X; 5) Then tree_mod_log_insert_root() calls tree_mod_log_free_eb() with tm_list as argument. 
After that, tree_mod_log_free_eb() calls __tree_mod_log_insert() for each member of tm_list in reverse order, from highest slot in eb X, slot N - 1, to slot 0 of eb X; 6) __tree_mod_log_insert() sets the sequence number of each given tree mod log operation - it increments fs_info->tree_mod_seq and sets fs_info->tree_mod_seq as the sequence number of the given tree mod log operation. This means that for the tm_list created at tree_mod_log_insert_root(), the element corresponding to slot 0 of eb X has the highest sequence number (1 + N), and the element corresponding to the last slot has the lowest sequence number (2); 7) Then, after inserting tm_list's elements into the tree mod log rbtree, the MOD_LOG_ROOT_REPLACE element is inserted, which gets the highest sequence number, which is N + 2; 8) Back to ctree.c:balance_level(), we free eb X by calling btrfs_free_tree_block() on it. Because eb X was created in the current transaction, has no other references and writeback did not happen for it, we add it back to the free space cache/tree; 9) Later some other task T allocates the metadata extent from eb X, since it is marked as free space in the space cache/tree, and uses it as a node for some other btree; 10) The tree mod log user task calls btrfs_search_old_slot(), which calls get_old_root(), and finally that calls __tree_mod_log_oldest_root() with time_seq == 1 and eb_root == eb Y; 11) First iteration of the while loop finds the tree mod log element with sequence number N + 2, for the logical address of eb Y and of type MOD_LOG_ROOT_REPLACE; 12) Because the operation type is MOD_LOG_ROOT_REPLACE, we don't break out of the loop, and set root_logical to point to tm->old_root.logical which corresponds to the logical address of eb X; 13) On the next iteration of the while loop, the call to tree_mod_log_search_oldest() returns the smallest tree mod log element for the logical address of eb X, which has a sequence number of 2, an operation type of 
MOD_LOG_KEY_REMOVE_WHILE_FREEING and corresponds to the old slot N - 1 of eb X (eb X had N items in it before being freed); 14) We then break out of the while loop and return the tree mod log operation of type MOD_LOG_ROOT_REPLACE (eb Y), and not the one for slot N - 1 of eb X, to get_old_root(); 15) At get_old_root(), we process the MOD_LOG_ROOT_REPLACE operation and set "logical" to the logical address of eb X, which was the old root. We then call tree_mod_log_search() passing it the logical address of eb X and time_seq == 1; 16) Then before calling tree_mod_log_search(), task T adds a key to eb X, which results in adding a tree mod log operation of type MOD_LOG_KEY_ADD to the tree mod log - this is done at ctree.c:insert_ptr() - but after adding the tree mod log operation and before updating the number of items in eb X from 0 to 1... 17) The task at get_old_root() calls tree_mod_log_search() and gets the tree mod log operation of type MOD_LOG_KEY_ADD just added by task T. Then it enters the following if branch: if (old_root && tm && tm->op != MOD_LOG_KEY_REMOVE_WHILE_FREEING) { (...) } (...) Calls read_tree_block() for eb X, which gets a reference on eb X but does not lock it - task T has it locked. Then it clones eb X while it has nritems set to 0 in its header, before task T sets nritems to 1 in eb X's header. From hereupon we use the clone of eb X which no other task has access to; 18) Then we call __tree_mod_log_rewind(), passing it the MOD_LOG_KEY_ADD mod log operation we just got from tree_mod_log_search() in the previous step and the cloned version of eb X; 19) At __tree_mod_log_rewind(), we set the local variable "n" to the number of items set in eb X's clone, which is 0. Then we enter the while loop, and in its first iteration we process the MOD_LOG_KEY_ADD operation, which just decrements "n" from 0 to (u32)-1, since "n" is declared with a type of u32. 
At the end of this iteration we call rb_next() to find the next tree mod log operation for eb X, that gives us the mod log operation of type MOD_LOG_KEY_REMOVE_WHILE_FREEING, for slot 0, with a sequence number of N + 1 (steps 3 to 6); 20) Then we go back to the top of the while loop and trigger the following BUG_ON(): (...) switch (tm->op) { case MOD_LOG_KEY_REMOVE_WHILE_FREEING: BUG_ON(tm->slot < n); fallthrough; (...) Because "n" has a value of (u32)-1 (4294967295) and tm->slot is 0. Fix this by taking a read lock on the extent buffer before cloning it at ctree.c:get_old_root(). This should be done regardless of the extent buffer having been freed and reused, as a concurrent task might be modifying it (while holding a write lock on it). Reported-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org> Link: https://lore.kernel.org/linux-btrfs/20210227155037.GN28049@hungrycats.org/ Fixes: `834328a849` ("Btrfs: tree mod log's old roots could still be part of the tree") CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-25 09:04:06 +01:00
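The arithmetic behind steps 5 to 7 and steps 19 to 20 of the btrfs commit above can be checked with a small userspace model. This is a sketch under stated assumptions, not btrfs code: `assign_seqs()` and `rewind_would_bug()` are hypothetical helpers that only mirror the counter and comparison the message describes.

```c
#include <stdint.h>

/*
 * Steps 5-7: the per-slot elements are inserted from slot N - 1 down to
 * slot 0, and each insertion increments the global counter before using it.
 * The ROOT_REPLACE element is inserted last.
 */
static void assign_seqs(uint64_t *tree_mod_seq, uint64_t *slot_seq,
			uint32_t nslots, uint64_t *root_replace_seq)
{
	for (uint32_t i = nslots; i-- > 0; )
		slot_seq[i] = ++(*tree_mod_seq);
	*root_replace_seq = ++(*tree_mod_seq);
}

/*
 * Steps 19-20: "n" starts as the item count read from the clone. Rewinding
 * one KEY_ADD decrements it; because n is unsigned 32-bit, 0 wraps to
 * 4294967295. Returns 1 if the condition tm->slot < n would then be true
 * for a following KEY_REMOVE_WHILE_FREEING element.
 */
static int rewind_would_bug(uint32_t nritems_in_clone, uint32_t freeing_slot)
{
	uint32_t n = nritems_in_clone;

	n--;	/* undo the KEY_ADD */
	return freeing_slot < n;
}
```

Starting from a counter of 1 with N = 4, the last slot receives sequence number 2, slot 0 receives 5 (1 + N) and the root replacement receives 6 (N + 2). A clone taken while nritems is still 0 makes the check fire for slot 0, whereas a clone that already sees nritems == 1 does not.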
Chao Yu	78486cf1f3	zonefs: fix to update .i_wr_refcnt correctly in zonefs_open_zone() commit `6980d29ce4` upstream. In zonefs_open_zone(), if the number of open zones exceeds the .s_max_open_zones threshold, .i_wr_refcnt is not restored on the error path. Fix this. Fixes: `b5c00e9757` ("zonefs: open/close zone on file open/close") Cc: <stable@vger.kernel.org> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-25 09:04:05 +01:00
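The invariant behind the zonefs fix above is that a failed open must leave every counter as it found it. The following userspace sketch illustrates that idea only; it is not the upstream patch. `sb_info`, `zone_inode` and `open_zone()` are hypothetical, plain ints stand in for the kernel's atomic counter and locking, and the -EBUSY return value is an assumption, since the commit message does not name the error code.

```c
#include <errno.h>

/* Hypothetical per-superblock state: open zones and the allowed maximum. */
struct sb_info {
	int s_open_zones;
	int s_max_open_zones;
};

/* Hypothetical per-inode state: number of writers holding the zone open. */
struct zone_inode {
	int i_wr_refcnt;
};

static int open_zone(struct sb_info *sbi, struct zone_inode *zi)
{
	zi->i_wr_refcnt++;
	if (zi->i_wr_refcnt == 1) {
		/* First writer: this inode's zone counts against the limit. */
		if (++sbi->s_open_zones > sbi->s_max_open_zones) {
			/* Undo both counters before failing, so a later
			 * open still looks like the first writer. */
			sbi->s_open_zones--;
			zi->i_wr_refcnt--;
			return -EBUSY;	/* assumed error code */
		}
	}
	return 0;
}
```

Without the `zi->i_wr_refcnt--` on the error path, a later successful open of the same inode would see a reference count of 2 and skip the open-zone accounting.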


68363 Commits