Commit Graph

56733 Commits

Author SHA1 Message Date
Greg Kroah-Hartman
8ca5759502 This is the 4.19.73 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl1/KiEACgkQONu9yGCS
 aT49JBAAy7b3wv1WXAtg9wsyS1JL4HbMXt3YjtokIX+UpkznoqII4B85QftPBbiD
 9zDuTWPjhrqKv1GsMkFRCqBVp5wGVik1MIbjVuKdstFN5W8KQybpbYnSW4T52+wS
 cs6oOPkLydAfWzKeq+ekEeU8yr5dua+Ui3huundZ49wseJWQP3fh9T+ToUx8V/cr
 tsLiRRgI0djj7KQWVuM1j8YGKT/6qk/UL0HMVZyoIdLmsxpLap+LWe0+CRXn8rvs
 eJJlVQTVtYf/ySoHkpnwR12VsjRYjx6pNkm/GrebMCkM7wF/4RMqxk7j9EU0PENH
 VUdRrUd+j/YPp6QzjSFMK0+0eb7Gm3X0FEN0IGZshu1r/CDnoj/7hqnBmOlYIbhv
 pdteYaLqWq7JjAHu7vF+S4aNQRGpAZb05LsbTJ39Eu3FbdVTLXsAuUveZ7Y4/y0X
 ri2M3d/sF/cjc3C+V7Y7h422SM36jSAK6496VAoRyqqjX/3JyROhgfU9NAMzVr83
 4uI904z9lH4TZGOd5YQgX2VuOtBcGwa7+g6fy97u1tp8UxSWFZRGDDLRysF/dIJO
 Wi51UK0Q7EWnqBTe0TFF6TjE5tC7R3ZgzqEQ1MU4eLI5mqokg82DAK4Ub2Wk5Qch
 CGs5/d16OOrLtG2RoaOGz9UdQR7IHUXLSqkKbaEdstc16MXNXns=
 =cmGh
 -----END PGP SIGNATURE-----

Merge 4.19.73 into android-4.19

Changes in 4.19.73
	ALSA: hda - Fix potential endless loop at applying quirks
	ALSA: hda/realtek - Fix overridden device-specific initialization
	ALSA: hda/realtek - Add quirk for HP Pavilion 15
	ALSA: hda/realtek - Enable internal speaker & headset mic of ASUS UX431FL
	ALSA: hda/realtek - Fix the problem of two front mics on a ThinkCentre
	sched/fair: Don't assign runtime for throttled cfs_rq
	drm/vmwgfx: Fix double free in vmw_recv_msg()
	vhost/test: fix build for vhost test
	vhost/test: fix build for vhost test - again
	powerpc/tm: Fix FP/VMX unavailable exceptions inside a transaction
	batman-adv: fix uninit-value in batadv_netlink_get_ifindex()
	batman-adv: Only read OGM tvlv_len after buffer len check
	hv_sock: Fix hang when a connection is closed
	Blk-iolatency: warn on negative inflight IO counter
	blk-iolatency: fix STS_AGAIN handling
	{nl,mac}80211: fix interface combinations on crypto controlled devices
	timekeeping: Use proper ktime_add when adding nsecs in coarse offset
	selftests: fib_rule_tests: use pre-defined DEV_ADDR
	x86/ftrace: Fix warning and considate ftrace_jmp_replace() and ftrace_call_replace()
	powerpc/64: mark start_here_multiplatform as __ref
	media: stm32-dcmi: fix irq = 0 case
	arm64: dts: rockchip: enable usb-host regulators at boot on rk3328-rock64
	scripts/decode_stacktrace: match basepath using shell prefix operator, not regex
	riscv: remove unused variable in ftrace
	nvme-fc: use separate work queue to avoid warning
	clk: s2mps11: Add used attribute to s2mps11_dt_match
	remoteproc: qcom: q6v5: shore up resource probe handling
	modules: always page-align module section allocations
	kernel/module: Fix mem leak in module_add_modinfo_attrs
	drm/i915: Re-apply "Perform link quality check, unconditionally during long pulse"
	media: cec/v4l2: move V4L2 specific CEC functions to V4L2
	media: cec: remove cec-edid.c
	scsi: qla2xxx: Move log messages before issuing command to firmware
	keys: Fix the use of the C++ keyword "private" in uapi/linux/keyctl.h
	Drivers: hv: kvp: Fix two "this statement may fall through" warnings
	x86, hibernate: Fix nosave_regions setup for hibernation
	remoteproc: qcom: q6v5-mss: add SCM probe dependency
	drm/amdgpu/gfx9: Update gfx9 golden settings.
	drm/amdgpu: Update gc_9_0 golden settings.
	KVM: x86: hyperv: enforce vp_index < KVM_MAX_VCPUS
	KVM: x86: hyperv: consistently use 'hv_vcpu' for 'struct kvm_vcpu_hv' variables
	KVM: x86: hyperv: keep track of mismatched VP indexes
	KVM: hyperv: define VP assist page helpers
	x86/kvm/lapic: preserve gfn_to_hva_cache len on cache reinit
	drm/i915: Fix intel_dp_mst_best_encoder()
	drm/i915: Rename PLANE_CTL_DECOMPRESSION_ENABLE
	drm/i915/gen9+: Fix initial readout for Y tiled framebuffers
	drm/atomic_helper: Disallow new modesets on unregistered connectors
	Drivers: hv: kvp: Fix the indentation of some "break" statements
	Drivers: hv: kvp: Fix the recent regression caused by incorrect clean-up
	powerplay: Respect units on max dcfclk watermark
	drm/amd/pp: Fix truncated clock value when set watermark
	drm/amd/dm: Understand why attaching path/tile properties are needed
	ARM: davinci: da8xx: define gpio interrupts as separate resources
	ARM: davinci: dm365: define gpio interrupts as separate resources
	ARM: davinci: dm646x: define gpio interrupts as separate resources
	ARM: davinci: dm355: define gpio interrupts as separate resources
	ARM: davinci: dm644x: define gpio interrupts as separate resources
	s390/zcrypt: reinit ap queue state machine during device probe
	media: vim2m: use workqueue
	media: vim2m: use cancel_delayed_work_sync instead of flush_schedule_work
	drm/i915: Restore sane defaults for KMS on GEM error load
	drm/i915: Cleanup gt powerstate from gem
	KVM: PPC: Book3S HV: Fix race between kvm_unmap_hva_range and MMU mode switch
	Btrfs: clean up scrub is_dev_replace parameter
	Btrfs: fix deadlock with memory reclaim during scrub
	btrfs: Remove extent_io_ops::fill_delalloc
	btrfs: Fix error handling in btrfs_cleanup_ordered_extents
	scsi: megaraid_sas: Fix combined reply queue mode detection
	scsi: megaraid_sas: Add check for reset adapter bit
	scsi: megaraid_sas: Use 63-bit DMA addressing
	powerpc/pkeys: Fix handling of pkey state across fork()
	btrfs: volumes: Make sure no dev extent is beyond device boundary
	btrfs: Use real device structure to verify dev extent
	media: vim2m: only cancel work if it is for right context
	ARC: show_regs: lockdep: re-enable preemption
	ARC: mm: do_page_fault fixes #1: relinquish mmap_sem if signal arrives while handle_mm_fault
	IB/uverbs: Fix OOPs upon device disassociation
	crypto: ccree - fix resume race condition on init
	crypto: ccree - add missing inline qualifier
	drm/vblank: Allow dynamic per-crtc max_vblank_count
	drm/i915/ilk: Fix warning when reading emon_status with no output
	mfd: Kconfig: Fix I2C_DESIGNWARE_PLATFORM dependencies
	tpm: Fix some name collisions with drivers/char/tpm.h
	bcache: replace hard coded number with BUCKET_GC_GEN_MAX
	bcache: treat stale && dirty keys as bad keys
	KVM: VMX: Compare only a single byte for VMCS' "launched" in vCPU-run
	iio: adc: exynos-adc: Add S5PV210 variant
	dt-bindings: iio: adc: exynos-adc: Add S5PV210 variant
	iio: adc: exynos-adc: Use proper number of channels for Exynos4x12
	mt76: fix corrupted software generated tx CCMP PN
	drm/nouveau: Don't WARN_ON VCPI allocation failures
	iwlwifi: fix devices with PCI Device ID 0x34F0 and 11ac RF modules
	iwlwifi: add new card for 9260 series
	x86/kvmclock: set offset for kvm unstable clock
	spi: spi-gpio: fix SPI_CS_HIGH capability
	powerpc/kvm: Save and restore host AMR/IAMR/UAMOR
	mmc: renesas_sdhi: Fix card initialization failure in high speed mode
	btrfs: scrub: pass fs_info to scrub_setup_ctx
	btrfs: scrub: move scrub_setup_ctx allocation out of device_list_mutex
	btrfs: scrub: fix circular locking dependency warning
	btrfs: init csum_list before possible free
	PCI: qcom: Fix error handling in runtime PM support
	PCI: qcom: Don't deassert reset GPIO during probe
	drm: add __user attribute to ptr_to_compat()
	CIFS: Fix error paths in writeback code
	CIFS: Fix leaking locked VFS cache pages in writeback retry
	drm/i915: Handle vm_mmap error during I915_GEM_MMAP ioctl with WC set
	drm/i915: Sanity check mmap length against object size
	usb: typec: tcpm: Try PD-2.0 if sink does not respond to 3.0 source-caps
	arm64: dts: stratix10: add the sysmgr-syscon property from the gmac's
	IB/mlx5: Reset access mask when looping inside page fault handler
	kvm: mmu: Fix overflow on kvm mmu page limit calculation
	x86/kvm: move kvm_load/put_guest_xcr0 into atomic context
	KVM: x86: Always use 32-bit SMRAM save state for 32-bit kernels
	cifs: Fix lease buffer length error
	media: i2c: tda1997x: select V4L2_FWNODE
	ext4: protect journal inode's blocks using block_validity
	ARM: dts: qcom: ipq4019: fix PCI range
	ARM: dts: qcom: ipq4019: Fix MSI IRQ type
	ARM: dts: qcom: ipq4019: enlarge PCIe BAR range
	dt-bindings: mmc: Add supports-cqe property
	dt-bindings: mmc: Add disable-cqe-dcmd property.
	PCI: Add macro for Switchtec quirk declarations
	PCI: Reset Lenovo ThinkPad P50 nvgpu at boot if necessary
	dm mpath: fix missing call of path selector type->end_io
	blk-mq: free hw queue's resource in hctx's release handler
	mmc: sdhci-pci: Add support for Intel CML
	PCI: dwc: Use devm_pci_alloc_host_bridge() to simplify code
	cifs: smbd: take an array of reqeusts when sending upper layer data
	dm crypt: move detailed message into debug level
	signal/arc: Use force_sig_fault where appropriate
	ARC: mm: fix uninitialised signal code in do_page_fault
	ARC: mm: SIGSEGV userspace trying to access kernel virtual memory
	drm/amdkfd: Add missing Polaris10 ID
	kvm: Check irqchip mode before assign irqfd
	drm/amdgpu: fix ring test failure issue during s3 in vce 3.0 (V2)
	drm/amdgpu/{uvd,vcn}: fetch ring's read_ptr after alloc
	Btrfs: fix race between block group removal and block group allocation
	cifs: add spinlock for the openFileList to cifsInodeInfo
	clk: tegra: Fix maximum audio sync clock for Tegra124/210
	clk: tegra210: Fix default rates for HDA clocks
	IB/hfi1: Avoid hardlockup with flushlist_lock
	apparmor: reset pos on failure to unpack for various functions
	scsi: target/core: Use the SECTOR_SHIFT constant
	scsi: target/iblock: Fix overrun in WRITE SAME emulation
	staging: wilc1000: fix error path cleanup in wilc_wlan_initialize()
	scsi: zfcp: fix request object use-after-free in send path causing wrong traces
	cifs: Properly handle auto disabling of serverino option
	ALSA: hda - Don't resume forcibly i915 HDMI/DP codec
	ceph: use ceph_evict_inode to cleanup inode's resource
	KVM: x86: optimize check for valid PAT value
	KVM: VMX: Always signal #GP on WRMSR to MSR_IA32_CR_PAT with bad value
	KVM: VMX: Fix handling of #MC that occurs during VM-Entry
	KVM: VMX: check CPUID before allowing read/write of IA32_XSS
	KVM: PPC: Use ccr field in pt_regs struct embedded in vcpu struct
	KVM: PPC: Book3S HV: Fix CR0 setting in TM emulation
	ARM: dts: gemini: Set DIR-685 SPI CS as active low
	RDMA/srp: Document srp_parse_in() arguments
	RDMA/srp: Accept again source addresses that do not have a port number
	btrfs: correctly validate compression type
	resource: Include resource end in walk_*() interfaces
	resource: Fix find_next_iomem_res() iteration issue
	resource: fix locking in find_next_iomem_res()
	pstore: Fix double-free in pstore_mkfile() failure path
	dm thin metadata: check if in fail_io mode when setting needs_check
	drm/panel: Add support for Armadeus ST0700 Adapt
	ALSA: hda - Fix intermittent CORB/RIRB stall on Intel chips
	powerpc/mm: Limit rma_size to 1TB when running without HV mode
	iommu/iova: Remove stale cached32_node
	gpio: don't WARN() on NULL descs if gpiolib is disabled
	i2c: at91: disable TXRDY interrupt after sending data
	i2c: at91: fix clk_offset for sama5d2
	mm/migrate.c: initialize pud_entry in migrate_vma()
	iio: adc: gyroadc: fix uninitialized return code
	NFSv4: Fix delegation state recovery
	bcache: only clear BTREE_NODE_dirty bit when it is set
	bcache: add comments for mutex_lock(&b->write_lock)
	bcache: fix race in btree_flush_write()
	drm/i915: Make sure cdclk is high enough for DP audio on VLV/CHV
	virtio/s390: fix race on airq_areas[]
	drm/atomic_helper: Allow DPMS On<->Off changes for unregistered connectors
	ext4: don't perform block validity checks on the journal inode
	ext4: fix block validity checks for journal inodes using indirect blocks
	ext4: unsigned int compared against zero
	PCI: Reset both NVIDIA GPU and HDA in ThinkPad P50 workaround
	powerpc/tm: Remove msr_tm_active()
	powerpc/tm: Fix restoring FP/VMX facility incorrectly on interrupts
	vhost: make sure log_num < in_num
	Linux 4.19.73

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I7bc57825aeb36759bb8e8726888da9af06392c09
2019-09-16 09:35:02 +02:00
Colin Ian King
ff69322509 ext4: unsigned int compared against zero
[ Upstream commit fbbbbd2f28 ]

There are two cases where u32 variables n and err are being checked
for less than zero error values, the checks is always false because
the variables are not signed. Fix this by making the variables ints.

Addresses-Coverity: ("Unsigned compared against 0")
Fixes: 345c0dbf3a ("ext4: protect journal inode's blocks using block_validity")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-16 08:22:24 +02:00
Theodore Ts'o
292666d2d8 ext4: fix block validity checks for journal inodes using indirect blocks
[ Upstream commit 170417c8c7 ]

Commit 345c0dbf3a ("ext4: protect journal inode's blocks using
block_validity") failed to add an exception for the journal inode in
ext4_check_blockref(), which is the function used by ext4_get_branch()
for indirect blocks.  This caused attempts to read from the ext3-style
journals to fail with:

[  848.968550] EXT4-fs error (device sdb7): ext4_get_branch:171: inode #8: block 30343695: comm jbd2/sdb7-8: invalid block

Fix this by adding the missing exception check.

Fixes: 345c0dbf3a ("ext4: protect journal inode's blocks using block_validity")
Reported-by: Arthur Marsh <arthur.marsh@internode.on.net>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-16 08:22:24 +02:00
Theodore Ts'o
97fbf57346 ext4: don't perform block validity checks on the journal inode
[ Upstream commit 0a944e8a6c ]

Since the journal inode is already checked when we added it to the
block validity's system zone, if we check it again, we'll just trigger
a failure.

This was causing failures like this:

[   53.897001] EXT4-fs error (device sda): ext4_find_extent:909: inode
#8: comm jbd2/sda-8: pblk 121667583 bad header/extent: invalid extent entries - magic f30a, entries 8, max 340(340), depth 0(0)
[   53.931430] jbd2_journal_bmap: journal block not found at offset 49 on sda-8
[   53.938480] Aborting journal on device sda-8.

... but only if the system was under enough memory pressure that
logical->physical mapping for the journal inode gets pushed out of the
extent cache.  (This is why it wasn't noticed earlier.)

Fixes: 345c0dbf3a ("ext4: protect journal inode's blocks using block_validity")
Reported-by: Dan Rue <dan.rue@linaro.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-16 08:22:24 +02:00
Trond Myklebust
652993a5aa NFSv4: Fix delegation state recovery
[ Upstream commit 5eb8d18ca0 ]

Once we clear the NFS_DELEGATED_STATE flag, we're telling
nfs_delegation_claim_opens() that we're done recovering all open state
for that stateid, so we really need to ensure that we test for all
open modes that are currently cached and recover them before exiting
nfs4_open_delegation_recall().

Fixes: 24311f8841 ("NFSv4: Recovery of recalled read delegations...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v4.3+
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-16 08:22:22 +02:00
Norbert Manthey
5e9a2ce6d3 pstore: Fix double-free in pstore_mkfile() failure path
[ Upstream commit 4c6d80e114 ]

The pstore_mkfile() function is passed a pointer to a struct
pstore_record. On success it consumes this 'record' pointer and
references it from the created inode.

On failure, however, it may or may not free the record. There are even
two different code paths which return -ENOMEM -- one of which does and
the other doesn't free the record.

Make the behaviour deterministic by never consuming and freeing the
record when returning failure, allowing the caller to do the cleanup
consistently.

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>
Link: https://lore.kernel.org/r/1562331960-26198-1-git-send-email-nmanthey@amazon.de
Fixes: 83f70f0769 ("pstore: Do not duplicate record metadata")
Fixes: 1dfff7dd67 ("pstore: Pass record contents instead of copying")
Cc: stable@vger.kernel.org
[kees: also move "private" allocation location, rename inode cleanup label]
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-16 08:22:20 +02:00
Johannes Thumshirn
1c13c9c40e btrfs: correctly validate compression type
[ Upstream commit aa53e3bfac ]

Nikolay reported the following KASAN splat when running btrfs/048:

[ 1843.470920] ==================================================================
[ 1843.471971] BUG: KASAN: slab-out-of-bounds in strncmp+0x66/0xb0
[ 1843.472775] Read of size 1 at addr ffff888111e369e2 by task btrfs/3979

[ 1843.473904] CPU: 3 PID: 3979 Comm: btrfs Not tainted 5.2.0-rc3-default #536
[ 1843.475009] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[ 1843.476322] Call Trace:
[ 1843.476674]  dump_stack+0x7c/0xbb
[ 1843.477132]  ? strncmp+0x66/0xb0
[ 1843.477587]  print_address_description+0x114/0x320
[ 1843.478256]  ? strncmp+0x66/0xb0
[ 1843.478740]  ? strncmp+0x66/0xb0
[ 1843.479185]  __kasan_report+0x14e/0x192
[ 1843.479759]  ? strncmp+0x66/0xb0
[ 1843.480209]  kasan_report+0xe/0x20
[ 1843.480679]  strncmp+0x66/0xb0
[ 1843.481105]  prop_compression_validate+0x24/0x70
[ 1843.481798]  btrfs_xattr_handler_set_prop+0x65/0x160
[ 1843.482509]  __vfs_setxattr+0x71/0x90
[ 1843.483012]  __vfs_setxattr_noperm+0x84/0x130
[ 1843.483606]  vfs_setxattr+0xac/0xb0
[ 1843.484085]  setxattr+0x18c/0x230
[ 1843.484546]  ? vfs_setxattr+0xb0/0xb0
[ 1843.485048]  ? __mod_node_page_state+0x1f/0xa0
[ 1843.485672]  ? _raw_spin_unlock+0x24/0x40
[ 1843.486233]  ? __handle_mm_fault+0x988/0x1290
[ 1843.486823]  ? lock_acquire+0xb4/0x1e0
[ 1843.487330]  ? lock_acquire+0xb4/0x1e0
[ 1843.487842]  ? mnt_want_write_file+0x3c/0x80
[ 1843.488442]  ? debug_lockdep_rcu_enabled+0x22/0x40
[ 1843.489089]  ? rcu_sync_lockdep_assert+0xe/0x70
[ 1843.489707]  ? __sb_start_write+0x158/0x200
[ 1843.490278]  ? mnt_want_write_file+0x3c/0x80
[ 1843.490855]  ? __mnt_want_write+0x98/0xe0
[ 1843.491397]  __x64_sys_fsetxattr+0xba/0xe0
[ 1843.492201]  ? trace_hardirqs_off_thunk+0x1a/0x1c
[ 1843.493201]  do_syscall_64+0x6c/0x230
[ 1843.493988]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 1843.495041] RIP: 0033:0x7fa7a8a7707a
[ 1843.495819] Code: 48 8b 0d 21 de 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 be 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ee dd 2b 00 f7 d8 64 89 01 48
[ 1843.499203] RSP: 002b:00007ffcb73bca38 EFLAGS: 00000202 ORIG_RAX: 00000000000000be
[ 1843.500210] RAX: ffffffffffffffda RBX: 00007ffcb73bda9d RCX: 00007fa7a8a7707a
[ 1843.501170] RDX: 00007ffcb73bda9d RSI: 00000000006dc050 RDI: 0000000000000003
[ 1843.502152] RBP: 00000000006dc050 R08: 0000000000000000 R09: 0000000000000000
[ 1843.503109] R10: 0000000000000002 R11: 0000000000000202 R12: 00007ffcb73bda91
[ 1843.504055] R13: 0000000000000003 R14: 00007ffcb73bda82 R15: ffffffffffffffff

[ 1843.505268] Allocated by task 3979:
[ 1843.505771]  save_stack+0x19/0x80
[ 1843.506211]  __kasan_kmalloc.constprop.5+0xa0/0xd0
[ 1843.506836]  setxattr+0xeb/0x230
[ 1843.507264]  __x64_sys_fsetxattr+0xba/0xe0
[ 1843.507886]  do_syscall_64+0x6c/0x230
[ 1843.508429]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

[ 1843.509558] Freed by task 0:
[ 1843.510188] (stack is not available)

[ 1843.511309] The buggy address belongs to the object at ffff888111e369e0
                which belongs to the cache kmalloc-8 of size 8
[ 1843.514095] The buggy address is located 2 bytes inside of
                8-byte region [ffff888111e369e0, ffff888111e369e8)
[ 1843.516524] The buggy address belongs to the page:
[ 1843.517561] page:ffff88813f478d80 refcount:1 mapcount:0 mapping:ffff88811940c300 index:0xffff888111e373b8 compound_mapcount: 0
[ 1843.519993] flags: 0x4404000010200(slab|head)
[ 1843.520951] raw: 0004404000010200 ffff88813f48b008 ffff888119403d50 ffff88811940c300
[ 1843.522616] raw: ffff888111e373b8 000000000016000f 00000001ffffffff 0000000000000000
[ 1843.524281] page dumped because: kasan: bad access detected

[ 1843.525936] Memory state around the buggy address:
[ 1843.526975]  ffff888111e36880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 1843.528479]  ffff888111e36900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 1843.530138] >ffff888111e36980: fc fc fc fc fc fc fc fc fc fc fc fc 02 fc fc fc
[ 1843.531877]                                                        ^
[ 1843.533287]  ffff888111e36a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 1843.534874]  ffff888111e36a80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 1843.536468] ==================================================================

This is caused by supplying a too short compression value ('lz') in the
test-case and comparing it to 'lzo' with strncmp() and a length of 3.
strncmp() read past the 'lz' when looking for the 'o' and thus caused an
out-of-bounds read.

Introduce a new check 'btrfs_compress_is_valid_type()' which not only
checks the user-supplied value against known compression types, but also
employs checks for too short values.

Reported-by: Nikolay Borisov <nborisov@suse.com>
Fixes: 272e5326c7 ("btrfs: prop: fix vanished compression property after failed set")
CC: stable@vger.kernel.org # 5.1+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-16 08:22:19 +02:00
Yan, Zheng
8128103999 ceph: use ceph_evict_inode to cleanup inode's resource
[ Upstream commit 87bc5b895d ]

remove_session_caps() relies on __wait_on_freeing_inode(), to wait for
freeing inode to remove its caps. But VFS wakes freeing inode waiters
before calling destroy_inode().

Cc: stable@vger.kernel.org
Link: https://tracker.ceph.com/issues/40102
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-16 08:22:17 +02:00
Paulo Alcantara (SUSE)
987564c28e cifs: Properly handle auto disabling of serverino option
[ Upstream commit 29fbeb7a90 ]

Fix mount options comparison when serverino option is turned off later
in cifs_autodisable_serverino() and thus avoiding mismatch of new cifs
mounts.

Cc: stable@vger.kernel.org
Signed-off-by: Paulo Alcantara (SUSE) <paulo@paulo.ac>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilove@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-16 08:22:17 +02:00
Ronnie Sahlberg
acc07941e2 cifs: add spinlock for the openFileList to cifsInodeInfo
[ Upstream commit 487317c994 ]

We can not depend on the tcon->open_file_lock here since in multiuser mode
we may have the same file/inode open via multiple different tcons.

The current code is race prone and will crash if one user deletes a file
at the same time a different user opens/create the file.

To avoid this we need to have a spinlock attached to the inode and not the tcon.

RHBZ:  1580165

CC: Stable <stable@vger.kernel.org>
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-16 08:22:15 +02:00
Filipe Manana
1d0648767c Btrfs: fix race between block group removal and block group allocation
[ Upstream commit 8eaf40c0e2 ]

If a task is removing the block group that currently has the highest start
offset amongst all existing block groups, there is a short time window
where it races with a concurrent block group allocation, resulting in a
transaction abort with an error code of EEXIST.

The following diagram explains the race in detail:

      Task A                                                        Task B

 btrfs_remove_block_group(bg offset X)

   remove_extent_mapping(em offset X)
     -> removes extent map X from the
        tree of extent maps
        (fs_info->mapping_tree), so the
        next call to find_next_chunk()
        will return offset X

                                                   btrfs_alloc_chunk()
                                                     find_next_chunk()
                                                       --> returns offset X

                                                     __btrfs_alloc_chunk(offset X)
                                                       btrfs_make_block_group()
                                                         btrfs_create_block_group_cache()
                                                           --> creates btrfs_block_group_cache
                                                               object with a key corresponding
                                                               to the block group item in the
                                                               extent, the key is:
                                                               (offset X, BTRFS_BLOCK_GROUP_ITEM_KEY, 1G)

                                                         --> adds the btrfs_block_group_cache object
                                                             to the list new_bgs of the transaction
                                                             handle

                                                   btrfs_end_transaction(trans handle)
                                                     __btrfs_end_transaction()
                                                       btrfs_create_pending_block_groups()
                                                         --> sees the new btrfs_block_group_cache
                                                             in the new_bgs list of the transaction
                                                             handle
                                                         --> its call to btrfs_insert_item() fails
                                                             with -EEXIST when attempting to insert
                                                             the block group item key
                                                             (offset X, BTRFS_BLOCK_GROUP_ITEM_KEY, 1G)
                                                             because task A has not removed that key yet
                                                         --> aborts the running transaction with
                                                             error -EEXIST

   btrfs_del_item()
     -> removes the block group's key from
        the extent tree, key is
        (offset X, BTRFS_BLOCK_GROUP_ITEM_KEY, 1G)

A sample transaction abort trace:

  [78912.403537] ------------[ cut here ]------------
  [78912.403811] BTRFS: Transaction aborted (error -17)
  [78912.404082] WARNING: CPU: 2 PID: 20465 at fs/btrfs/extent-tree.c:10551 btrfs_create_pending_block_groups+0x196/0x250 [btrfs]
  (...)
  [78912.405642] CPU: 2 PID: 20465 Comm: btrfs Tainted: G        W         5.0.0-btrfs-next-46 #1
  [78912.405941] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014
  [78912.406586] RIP: 0010:btrfs_create_pending_block_groups+0x196/0x250 [btrfs]
  (...)
  [78912.407636] RSP: 0018:ffff9d3d4b7e3b08 EFLAGS: 00010282
  [78912.407997] RAX: 0000000000000000 RBX: ffff90959a3796f0 RCX: 0000000000000006
  [78912.408369] RDX: 0000000000000007 RSI: 0000000000000001 RDI: ffff909636b16860
  [78912.408746] RBP: ffff909626758a58 R08: 0000000000000000 R09: 0000000000000000
  [78912.409144] R10: ffff9095ff462400 R11: 0000000000000000 R12: ffff90959a379588
  [78912.409521] R13: ffff909626758ab0 R14: ffff9095036c0000 R15: ffff9095299e1158
  [78912.409899] FS:  00007f387f16f700(0000) GS:ffff909636b00000(0000) knlGS:0000000000000000
  [78912.410285] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [78912.410673] CR2: 00007f429fc87cbc CR3: 000000014440a004 CR4: 00000000003606e0
  [78912.411095] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [78912.411496] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [78912.411898] Call Trace:
  [78912.412318]  __btrfs_end_transaction+0x5b/0x1c0 [btrfs]
  [78912.412746]  btrfs_inc_block_group_ro+0xcf/0x160 [btrfs]
  [78912.413179]  scrub_enumerate_chunks+0x188/0x5b0 [btrfs]
  [78912.413622]  ? __mutex_unlock_slowpath+0x100/0x2a0
  [78912.414078]  btrfs_scrub_dev+0x2ef/0x720 [btrfs]
  [78912.414535]  ? __sb_start_write+0xd4/0x1c0
  [78912.414963]  ? mnt_want_write_file+0x24/0x50
  [78912.415403]  btrfs_ioctl+0x17fb/0x3120 [btrfs]
  [78912.415832]  ? lock_acquire+0xa6/0x190
  [78912.416256]  ? do_vfs_ioctl+0xa2/0x6f0
  [78912.416685]  ? btrfs_ioctl_get_supported_features+0x30/0x30 [btrfs]
  [78912.417116]  do_vfs_ioctl+0xa2/0x6f0
  [78912.417534]  ? __fget+0x113/0x200
  [78912.417954]  ksys_ioctl+0x70/0x80
  [78912.418369]  __x64_sys_ioctl+0x16/0x20
  [78912.418812]  do_syscall_64+0x60/0x1b0
  [78912.419231]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
  [78912.419644] RIP: 0033:0x7f3880252dd7
  (...)
  [78912.420957] RSP: 002b:00007f387f16ed68 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
  [78912.421426] RAX: ffffffffffffffda RBX: 000055f5becc1df0 RCX: 00007f3880252dd7
  [78912.421889] RDX: 000055f5becc1df0 RSI: 00000000c400941b RDI: 0000000000000003
  [78912.422354] RBP: 0000000000000000 R08: 00007f387f16f700 R09: 0000000000000000
  [78912.422790] R10: 00007f387f16f700 R11: 0000000000000246 R12: 0000000000000000
  [78912.423202] R13: 00007ffda49c266f R14: 0000000000000000 R15: 00007f388145e040
  [78912.425505] ---[ end trace eb9bfe7c426fc4d3 ]---

Fix this by calling remove_extent_mapping(), at btrfs_remove_block_group(),
only at the very end, after removing the block group item key from the
extent tree (and removing the free space tree entry if we are using the
free space tree feature).

Fixes: 04216820fe ("Btrfs: fix race between fs trimming and block group remove/allocation")
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-16 08:22:15 +02:00
Long Li
96b44c20e6 cifs: smbd: take an array of reqeusts when sending upper layer data
[ Upstream commit 4739f23286 ]

To support compounding, __smb_send_rqst() now sends an array of requests to
the transport layer.
Change smbd_send() to take an array of requests, and send them in as few
packets as possible.

Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-16 08:22:13 +02:00
Theodore Ts'o
2fd4629de5 ext4: protect journal inode's blocks using block_validity
[ Upstream commit 345c0dbf3a ]

Add the blocks which belong to the journal inode to block_validity's
system zone so attempts to deallocate or overwrite the journal due a
corrupted file system where the journal blocks are also claimed by
another inode.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=202879
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-16 08:22:11 +02:00
ZhangXiaoxu
4061e662c8 cifs: Fix lease buffer length error
[ Upstream commit b57a55e220 ]

There is a KASAN slab-out-of-bounds:
BUG: KASAN: slab-out-of-bounds in _copy_from_iter_full+0x783/0xaa0
Read of size 80 at addr ffff88810c35e180 by task mount.cifs/539

CPU: 1 PID: 539 Comm: mount.cifs Not tainted 4.19 #10
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
            rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014
Call Trace:
 dump_stack+0xdd/0x12a
 print_address_description+0xa7/0x540
 kasan_report+0x1ff/0x550
 check_memory_region+0x2f1/0x310
 memcpy+0x2f/0x80
 _copy_from_iter_full+0x783/0xaa0
 tcp_sendmsg_locked+0x1840/0x4140
 tcp_sendmsg+0x37/0x60
 inet_sendmsg+0x18c/0x490
 sock_sendmsg+0xae/0x130
 smb_send_kvec+0x29c/0x520
 __smb_send_rqst+0x3ef/0xc60
 smb_send_rqst+0x25a/0x2e0
 compound_send_recv+0x9e8/0x2af0
 cifs_send_recv+0x24/0x30
 SMB2_open+0x35e/0x1620
 open_shroot+0x27b/0x490
 smb2_open_op_close+0x4e1/0x590
 smb2_query_path_info+0x2ac/0x650
 cifs_get_inode_info+0x1058/0x28f0
 cifs_root_iget+0x3bb/0xf80
 cifs_smb3_do_mount+0xe00/0x14c0
 cifs_do_mount+0x15/0x20
 mount_fs+0x5e/0x290
 vfs_kern_mount+0x88/0x460
 do_mount+0x398/0x31e0
 ksys_mount+0xc6/0x150
 __x64_sys_mount+0xea/0x190
 do_syscall_64+0x122/0x590
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

It can be reproduced by the following step:
  1. samba configured with: server max protocol = SMB2_10
  2. mount -o vers=default

When parse the mount version parameter, the 'ops' and 'vals'
was setted to smb30,  if negotiate result is smb21, just
update the 'ops' to smb21, but the 'vals' is still smb30.
When add lease context, the iov_base is allocated with smb21
ops, but the iov_len is initiallited with the smb30. Because
the iov_len is longer than iov_base, when send the message,
copy array out of bounds.

we need to keep the 'ops' and 'vals' consistent.

Fixes: 9764c02fcb ("SMB3: Add support for multidialect negotiate (SMB2.1 and later)")
Fixes: d5c7076b77 ("smb3: add smb3.1.1 to default dialect list")

Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
CC: Stable <stable@vger.kernel.org>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-16 08:22:10 +02:00
Pavel Shilovsky
778d626c6a CIFS: Fix leaking locked VFS cache pages in writeback retry
[ Upstream commit 165df9a080 ]

If we don't find a writable file handle when retrying writepages
we break of the loop and do not unlock and put pages neither from
wdata2 nor from the original wdata. Fix this by walking through
all the remaining pages and cleanup them properly.

Cc: <stable@vger.kernel.org>
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-16 08:22:09 +02:00
Pavel Shilovsky
fb2dabeabb CIFS: Fix error paths in writeback code
[ Upstream commit 9a66396f18 ]

This patch aims to address writeback code problems related to error
paths. In particular it respects EINTR and related error codes and
stores and returns the first error occurred during writeback.

Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Acked-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-16 08:22:09 +02:00
Dan Robertson
476ecc14cf btrfs: init csum_list before possible free
[ Upstream commit e49be14b8d ]

The scrub_ctx csum_list member must be initialized before scrub_free_ctx
is called. If the csum_list is not initialized beforehand, the
list_empty call in scrub_free_csums will result in a null deref if the
allocation fails in the for loop.

Fixes: a2de733c78 ("btrfs: scrub")
CC: stable@vger.kernel.org # 3.0+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Dan Robertson <dan@dlrobertson.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-16 08:22:08 +02:00
Anand Jain
936690bdd8 btrfs: scrub: fix circular locking dependency warning
[ Upstream commit 1cec3f2716 ]

This fixes a longstanding lockdep warning triggered by
fstests/btrfs/011.

Circular locking dependency check reports warning[1], that's because the
btrfs_scrub_dev() calls the stack #0 below with, the fs_info::scrub_lock
held. The test case leading to this warning:

  $ mkfs.btrfs -f /dev/sdb
  $ mount /dev/sdb /btrfs
  $ btrfs scrub start -B /btrfs

In fact we have fs_info::scrub_workers_refcnt to track if the init and destroy
of the scrub workers are needed. So once we have incremented and decremented
the fs_info::scrub_workers_refcnt value in the thread, its ok to drop the
scrub_lock, and then actually do the btrfs_destroy_workqueue() part. So this
patch drops the scrub_lock before calling btrfs_destroy_workqueue().

  [359.258534] ======================================================
  [359.260305] WARNING: possible circular locking dependency detected
  [359.261938] 5.0.0-rc6-default #461 Not tainted
  [359.263135] ------------------------------------------------------
  [359.264672] btrfs/20975 is trying to acquire lock:
  [359.265927] 00000000d4d32bea ((wq_completion)"%s-%s""btrfs", name){+.+.}, at: flush_workqueue+0x87/0x540
  [359.268416]
  [359.268416] but task is already holding lock:
  [359.270061] 0000000053ea26a6 (&fs_info->scrub_lock){+.+.}, at: btrfs_scrub_dev+0x322/0x590 [btrfs]
  [359.272418]
  [359.272418] which lock already depends on the new lock.
  [359.272418]
  [359.274692]
  [359.274692] the existing dependency chain (in reverse order) is:
  [359.276671]
  [359.276671] -> #3 (&fs_info->scrub_lock){+.+.}:
  [359.278187]        __mutex_lock+0x86/0x9c0
  [359.279086]        btrfs_scrub_pause+0x31/0x100 [btrfs]
  [359.280421]        btrfs_commit_transaction+0x1e4/0x9e0 [btrfs]
  [359.281931]        close_ctree+0x30b/0x350 [btrfs]
  [359.283208]        generic_shutdown_super+0x64/0x100
  [359.284516]        kill_anon_super+0x14/0x30
  [359.285658]        btrfs_kill_super+0x12/0xa0 [btrfs]
  [359.286964]        deactivate_locked_super+0x29/0x60
  [359.288242]        cleanup_mnt+0x3b/0x70
  [359.289310]        task_work_run+0x98/0xc0
  [359.290428]        exit_to_usermode_loop+0x83/0x90
  [359.291445]        do_syscall_64+0x15b/0x180
  [359.292598]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
  [359.294011]
  [359.294011] -> #2 (sb_internal#2){.+.+}:
  [359.295432]        __sb_start_write+0x113/0x1d0
  [359.296394]        start_transaction+0x369/0x500 [btrfs]
  [359.297471]        btrfs_finish_ordered_io+0x2aa/0x7c0 [btrfs]
  [359.298629]        normal_work_helper+0xcd/0x530 [btrfs]
  [359.299698]        process_one_work+0x246/0x610
  [359.300898]        worker_thread+0x3c/0x390
  [359.302020]        kthread+0x116/0x130
  [359.303053]        ret_from_fork+0x24/0x30
  [359.304152]
  [359.304152] -> #1 ((work_completion)(&work->normal_work)){+.+.}:
  [359.306100]        process_one_work+0x21f/0x610
  [359.307302]        worker_thread+0x3c/0x390
  [359.308465]        kthread+0x116/0x130
  [359.309357]        ret_from_fork+0x24/0x30
  [359.310229]
  [359.310229] -> #0 ((wq_completion)"%s-%s""btrfs", name){+.+.}:
  [359.311812]        lock_acquire+0x90/0x180
  [359.312929]        flush_workqueue+0xaa/0x540
  [359.313845]        drain_workqueue+0xa1/0x180
  [359.314761]        destroy_workqueue+0x17/0x240
  [359.315754]        btrfs_destroy_workqueue+0x57/0x200 [btrfs]
  [359.317245]        scrub_workers_put+0x2c/0x60 [btrfs]
  [359.318585]        btrfs_scrub_dev+0x336/0x590 [btrfs]
  [359.319944]        btrfs_dev_replace_by_ioctl.cold.19+0x179/0x1bb [btrfs]
  [359.321622]        btrfs_ioctl+0x28a4/0x2e40 [btrfs]
  [359.322908]        do_vfs_ioctl+0xa2/0x6d0
  [359.324021]        ksys_ioctl+0x3a/0x70
  [359.325066]        __x64_sys_ioctl+0x16/0x20
  [359.326236]        do_syscall_64+0x54/0x180
  [359.327379]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
  [359.328772]
  [359.328772] other info that might help us debug this:
  [359.328772]
  [359.330990] Chain exists of:
  [359.330990]   (wq_completion)"%s-%s""btrfs", name --> sb_internal#2 --> &fs_info->scrub_lock
  [359.330990]
  [359.334376]  Possible unsafe locking scenario:
  [359.334376]
  [359.336020]        CPU0                    CPU1
  [359.337070]        ----                    ----
  [359.337821]   lock(&fs_info->scrub_lock);
  [359.338506]                                lock(sb_internal#2);
  [359.339506]                                lock(&fs_info->scrub_lock);
  [359.341461]   lock((wq_completion)"%s-%s""btrfs", name);
  [359.342437]
  [359.342437]  *** DEADLOCK ***
  [359.342437]
  [359.343745] 1 lock held by btrfs/20975:
  [359.344788]  #0: 0000000053ea26a6 (&fs_info->scrub_lock){+.+.}, at: btrfs_scrub_dev+0x322/0x590 [btrfs]
  [359.346778]
  [359.346778] stack backtrace:
  [359.347897] CPU: 0 PID: 20975 Comm: btrfs Not tainted 5.0.0-rc6-default #461
  [359.348983] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626cc-prebuilt.qemu-project.org 04/01/2014
  [359.350501] Call Trace:
  [359.350931]  dump_stack+0x67/0x90
  [359.351676]  print_circular_bug.isra.37.cold.56+0x15c/0x195
  [359.353569]  check_prev_add.constprop.44+0x4f9/0x750
  [359.354849]  ? check_prev_add.constprop.44+0x286/0x750
  [359.356505]  __lock_acquire+0xb84/0xf10
  [359.357505]  lock_acquire+0x90/0x180
  [359.358271]  ? flush_workqueue+0x87/0x540
  [359.359098]  flush_workqueue+0xaa/0x540
  [359.359912]  ? flush_workqueue+0x87/0x540
  [359.360740]  ? drain_workqueue+0x1e/0x180
  [359.361565]  ? drain_workqueue+0xa1/0x180
  [359.362391]  drain_workqueue+0xa1/0x180
  [359.363193]  destroy_workqueue+0x17/0x240
  [359.364539]  btrfs_destroy_workqueue+0x57/0x200 [btrfs]
  [359.365673]  scrub_workers_put+0x2c/0x60 [btrfs]
  [359.366618]  btrfs_scrub_dev+0x336/0x590 [btrfs]
  [359.367594]  ? start_transaction+0xa1/0x500 [btrfs]
  [359.368679]  btrfs_dev_replace_by_ioctl.cold.19+0x179/0x1bb [btrfs]
  [359.369545]  btrfs_ioctl+0x28a4/0x2e40 [btrfs]
  [359.370186]  ? __lock_acquire+0x263/0xf10
  [359.370777]  ? kvm_clock_read+0x14/0x30
  [359.371392]  ? kvm_sched_clock_read+0x5/0x10
  [359.372248]  ? sched_clock+0x5/0x10
  [359.372786]  ? sched_clock_cpu+0xc/0xc0
  [359.373662]  ? do_vfs_ioctl+0xa2/0x6d0
  [359.374552]  do_vfs_ioctl+0xa2/0x6d0
  [359.375378]  ? do_sigaction+0xff/0x250
  [359.376233]  ksys_ioctl+0x3a/0x70
  [359.376954]  __x64_sys_ioctl+0x16/0x20
  [359.377772]  do_syscall_64+0x54/0x180
  [359.378841]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
  [359.380422] RIP: 0033:0x7f5429296a97

Backporting to older kernels: scrub_nocow_workers must be freed the same
way as the others.

CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Anand Jain <anand.jain@oracle.com>
[ update changelog ]
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-16 08:22:08 +02:00
David Sterba
ff55333f5c btrfs: scrub: move scrub_setup_ctx allocation out of device_list_mutex
[ Upstream commit 0e94c4f45d ]

The scrub context is allocated with GFP_KERNEL and called from
btrfs_scrub_dev under the fs_info::device_list_mutex. This is not safe
regarding reclaim that could try to flush filesystem data in order to
get the memory. And the device_list_mutex is held during superblock
commit, so this would cause a lockup.

Move the alocation and initialization before any changes that require
the mutex.

Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-16 08:22:07 +02:00
David Sterba
8ba3169dce btrfs: scrub: pass fs_info to scrub_setup_ctx
[ Upstream commit 92f7ba434f ]

We can pass fs_info directly as this is the only member of btrfs_device
that's bing used inside scrub_setup_ctx.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-16 08:22:07 +02:00
Qu Wenruo
be77686f03 btrfs: Use real device structure to verify dev extent
[ Upstream commit 1b3922a8bc ]

[BUG]
Linux v5.0-rc1 will fail fstests/btrfs/163 with the following kernel
message:

  BTRFS error (device dm-6): dev extent devid 1 physical offset 13631488 len 8388608 is beyond device boundary 0
  BTRFS error (device dm-6): failed to verify dev extents against chunks: -117
  BTRFS error (device dm-6): open_ctree failed

[CAUSE]
Commit cf90d884b3 ("btrfs: Introduce mount time chunk <-> dev extent
mapping check") introduced strict check on dev extents.

We use btrfs_find_device() with dev uuid and fs uuid set to NULL, and
only dependent on @devid to find the real device.

For seed devices, we call clone_fs_devices() in open_seed_devices() to
allow us search seed devices directly.

However clone_fs_devices() just populates devices with devid and dev
uuid, without populating other essential members, like disk_total_bytes.

This makes any device returned by btrfs_find_device(fs_info, devid,
NULL, NULL) is just a dummy, with 0 disk_total_bytes, and any dev
extents on the seed device will not pass the device boundary check.

[FIX]
This patch will try to verify the device returned by btrfs_find_device()
and if it's a dummy then re-search in seed devices.

Fixes: cf90d884b3 ("btrfs: Introduce mount time chunk <-> dev extent mapping check")
CC: stable@vger.kernel.org # 4.19+
Reported-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-16 08:22:01 +02:00
Qu Wenruo
a2790b9939 btrfs: volumes: Make sure no dev extent is beyond device boundary
[ Upstream commit 05a37c4860 ]

Add extra dev extent end check against device boundary.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-16 08:22:01 +02:00
Nikolay Borisov
eb124aaa2e btrfs: Fix error handling in btrfs_cleanup_ordered_extents
[ Upstream commit d1051d6ebf ]

Running btrfs/124 in a loop hung up on me sporadically with the
following call trace:

	btrfs           D    0  5760   5324 0x00000000
	Call Trace:
	 ? __schedule+0x243/0x800
	 schedule+0x33/0x90
	 btrfs_start_ordered_extent+0x10c/0x1b0 [btrfs]
	 ? wait_woken+0xa0/0xa0
	 btrfs_wait_ordered_range+0xbb/0x100 [btrfs]
	 btrfs_relocate_block_group+0x1ff/0x230 [btrfs]
	 btrfs_relocate_chunk+0x49/0x100 [btrfs]
	 btrfs_balance+0xbeb/0x1740 [btrfs]
	 btrfs_ioctl_balance+0x2ee/0x380 [btrfs]
	 btrfs_ioctl+0x1691/0x3110 [btrfs]
	 ? lockdep_hardirqs_on+0xed/0x180
	 ? __handle_mm_fault+0x8e7/0xfb0
	 ? _raw_spin_unlock+0x24/0x30
	 ? __handle_mm_fault+0x8e7/0xfb0
	 ? do_vfs_ioctl+0xa5/0x6e0
	 ? btrfs_ioctl_get_supported_features+0x30/0x30 [btrfs]
	 do_vfs_ioctl+0xa5/0x6e0
	 ? entry_SYSCALL_64_after_hwframe+0x3e/0xbe
	 ksys_ioctl+0x3a/0x70
	 __x64_sys_ioctl+0x16/0x20
	 do_syscall_64+0x60/0x1b0
	 entry_SYSCALL_64_after_hwframe+0x49/0xbe

This happens because during page writeback it's valid for
writepage_delalloc to instantiate a delalloc range which doesn't belong
to the page currently being written back.

The reason this case is valid is due to find_lock_delalloc_range
returning any available range after the passed delalloc_start and
ignoring whether the page under writeback is within that range.

In turn ordered extents (OE) are always created for the returned range
from find_lock_delalloc_range. If, however, a failure occurs while OE
are being created then the clean up code in btrfs_cleanup_ordered_extents
will be called.

Unfortunately the code in btrfs_cleanup_ordered_extents doesn't consider
the case of such 'foreign' range being processed and instead it always
assumes that the range OE are created for belongs to the page. This
leads to the first page of such foregin range to not be cleaned up since
it's deliberately missed and skipped by the current cleaning up code.

Fix this by correctly checking whether the current page belongs to the
range being instantiated and if so adjsut the range parameters passed
for cleaning up. If it doesn't, then just clean the whole OE range
directly.

Fixes: 524272607e ("btrfs: Handle delalloc error correctly to avoid ordered extent hang")
CC: stable@vger.kernel.org # 4.14+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-16 08:21:59 +02:00
Nikolay Borisov
1669d1d2e6 btrfs: Remove extent_io_ops::fill_delalloc
[ Upstream commit 5eaad97af8 ]

This callback is called only from writepage_delalloc which in turn is
guaranteed to be called from the data page writeout path. In the end
there is no reason to have the call to this function to be indrected via
the extent_io_ops structure. This patch removes the callback definition,
exports the function and calls it directly. No functional changes.

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ rename to btrfs_run_delalloc_range ]
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-16 08:21:59 +02:00
Filipe Manana
338a528b79 Btrfs: fix deadlock with memory reclaim during scrub
[ Upstream commit a5fb114291 ]

When a transaction commit starts, it attempts to pause scrub and it blocks
until the scrub is paused. So while the transaction is blocked waiting for
scrub to pause, we can not do memory allocation with GFP_KERNEL from scrub,
otherwise we risk getting into a deadlock with reclaim.

Checking for scrub pause requests is done early at the beginning of the
while loop of scrub_stripe() and later in the loop, scrub_extent() and
scrub_raid56_parity() are called, which in turn call scrub_pages() and
scrub_pages_for_parity() respectively. These last two functions do memory
allocations using GFP_KERNEL. Same problem could happen while scrubbing
the super blocks, since it calls scrub_pages().

We also can not have any of the worker tasks, created by the scrub task,
doing GFP_KERNEL allocations, because before pausing, the scrub task waits
for all the worker tasks to complete (also done at scrub_stripe()).

So make sure GFP_NOFS is used for the memory allocations because at any
time a scrub pause request can happen from another task that started to
commit a transaction.

Fixes: 58c4e17384 ("btrfs: scrub: use GFP_KERNEL on the submission path")
CC: stable@vger.kernel.org # 4.6+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-16 08:21:59 +02:00
Omar Sandoval
fac803479f Btrfs: clean up scrub is_dev_replace parameter
[ Upstream commit 3293428096 ]

struct scrub_ctx has an ->is_dev_replace member, so there's no point in
passing around is_dev_replace where sctx is available.

Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-16 08:21:58 +02:00
Greg Kroah-Hartman
5fc4dfdee6 This is the 4.19.72 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl13bgIACgkQONu9yGCS
 aT5lvg/+KvPsSPQiccNrK/mvPan29KMk7R8Yjr3hzjPxLFPvRz7tR69joU6yj1nw
 S4BNZYiVsT43d9U8FieniAz2ch2qDIQbIIrQTLqcED4vn7ih0cTt4783FNZPApsZ
 I84u6kgbD3x5tPyB/RYE79JCq4JDgV+GvyDK+MXcU8Dh5OHTjJzLrwqghbI/LH6k
 G9yoMSho23i1h+jl/JW+QfknarDaclZUCmL/BdZ3A6mILXpWcwF2j10KHBYTsPRd
 DvrbW9S7e78c80TIWPtCHr1RYY79J4NXvPPThpytvpOjMZYsdT3s7jJLoiV4+3gq
 G04Y2awbfyD8U3WB3q8qX9AMfF5B6uIOqnp3DRV/NX7LJHk3xBTTAY3/6d7xAZr7
 xJ9WSd4gzzpBu6EgcS8F52qrrxol2S8jGRoNh7zGgQe8YvxvW5ktgsOCshzvGGSb
 HgYBar4qiQepG4mwKpJ9FCzBY/H+uuIKOgJpgFGE/Hhmw4yw5REy0i62sf3Rkroa
 db1j+5egc0kvqLUVGp8lS5RPDrhJQ9A4XIUlThZWU2bGVl6wDFHaJyZdM1oEXFZU
 jgONQnM/ys9jAnzc9xzJ4Ck9u03qquDgMmC+bC1z2OlQ/tN+z5Sm4U86DjXd9qBD
 jcb9IaSZ9HEgwIzhJypQHHNjr7OVINmrQW5v74LHNb3u+jIiYiM=
 =59b4
 -----END PGP SIGNATURE-----

Merge 4.19.72 into android-4.19

Changes in 4.19.72
	mld: fix memory leak in mld_del_delrec()
	net: fix skb use after free in netpoll
	net: sched: act_sample: fix psample group handling on overwrite
	net_sched: fix a NULL pointer deref in ipt action
	net: stmmac: dwmac-rk: Don't fail if phy regulator is absent
	tcp: inherit timestamp on mtu probe
	tcp: remove empty skb from write queue in error cases
	net/rds: Fix info leak in rds6_inc_info_copy()
	x86/boot: Preserve boot_params.secure_boot from sanitizing
	spi: bcm2835aux: unifying code between polling and interrupt driven code
	spi: bcm2835aux: remove dangerous uncontrolled read of fifo
	spi: bcm2835aux: fix corruptions for longer spi transfers
	net: tundra: tsi108: use spin_lock_irqsave instead of spin_lock_irq in IRQ context
	netfilter: nf_tables: use-after-free in failing rule with bound set
	tools: bpftool: fix error message (prog -> object)
	hv_netvsc: Fix a warning of suspicious RCU usage
	net: tc35815: Explicitly check NET_IP_ALIGN is not zero in tc35815_rx
	Bluetooth: btqca: Add a short delay before downloading the NVM
	ibmveth: Convert multicast list size for little-endian system
	gpio: Fix build error of function redefinition
	netfilter: nft_flow_offload: skip tcp rst and fin packets
	drm/mediatek: use correct device to import PRIME buffers
	drm/mediatek: set DMA max segment size
	scsi: qla2xxx: Fix gnl.l memory leak on adapter init failure
	scsi: target: tcmu: avoid use-after-free after command timeout
	cxgb4: fix a memory leak bug
	liquidio: add cleanup in octeon_setup_iq()
	net: myri10ge: fix memory leaks
	lan78xx: Fix memory leaks
	vfs: fix page locking deadlocks when deduping files
	cx82310_eth: fix a memory leak bug
	net: kalmia: fix memory leaks
	ibmvnic: Unmap DMA address of TX descriptor buffers after use
	net: cavium: fix driver name
	wimax/i2400m: fix a memory leak bug
	ravb: Fix use-after-free ravb_tstamp_skb
	kprobes: Fix potential deadlock in kprobe_optimizer()
	HID: cp2112: prevent sleeping function called from invalid context
	x86/boot/compressed/64: Fix boot on machines with broken E820 table
	Input: hyperv-keyboard: Use in-place iterator API in the channel callback
	Tools: hv: kvp: eliminate 'may be used uninitialized' warning
	nvme-multipath: fix possible I/O hang when paths are updated
	IB/mlx4: Fix memory leaks
	infiniband: hfi1: fix a memory leak bug
	infiniband: hfi1: fix memory leaks
	selftests: kvm: fix state save/load on processors without XSAVE
	selftests/kvm: make platform_info_test pass on AMD
	ceph: fix buffer free while holding i_ceph_lock in __ceph_setxattr()
	ceph: fix buffer free while holding i_ceph_lock in __ceph_build_xattrs_blob()
	ceph: fix buffer free while holding i_ceph_lock in fill_inode()
	KVM: arm/arm64: Only skip MMIO insn once
	afs: Fix leak in afs_lookup_cell_rcu()
	KVM: arm/arm64: VGIC: Properly initialise private IRQ affinity
	x86/boot/compressed/64: Fix missing initialization in find_trampoline_placement()
	libceph: allow ceph_buffer_put() to receive a NULL ceph_buffer
	Revert "x86/apic: Include the LDR when clearing out APIC registers"
	Linux 4.19.72

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I207e2ba522f98f6e17dacd3a2527fab615155007
2019-09-10 10:48:21 +01:00
David Howells
1a31b0d0dd afs: Fix leak in afs_lookup_cell_rcu()
[ Upstream commit a5fb8e6c02 ]

Fix a leak on the cell refcount in afs_lookup_cell_rcu() due to
non-clearance of the default error in the case a NULL cell name is passed
and the workstation default cell is used.

Also put a bit at the end to make sure we don't leak a cell ref if we're
going to be returning an error.

This leak results in an assertion like the following when the kafs module is
unloaded:

	AFS: Assertion failed
	2 == 1 is false
	0x2 == 0x1 is false
	------------[ cut here ]------------
	kernel BUG at fs/afs/cell.c:770!
	...
	RIP: 0010:afs_manage_cells+0x220/0x42f [kafs]
	...
	 process_one_work+0x4c2/0x82c
	 ? pool_mayday_timeout+0x1e1/0x1e1
	 ? do_raw_spin_lock+0x134/0x175
	 worker_thread+0x336/0x4a6
	 ? rescuer_thread+0x4af/0x4af
	 kthread+0x1de/0x1ee
	 ? kthread_park+0xd4/0xd4
	 ret_from_fork+0x24/0x30

Fixes: 989782dcdc ("afs: Overhaul cell database management")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-10 10:33:52 +01:00
Luis Henriques
b84817d96e ceph: fix buffer free while holding i_ceph_lock in fill_inode()
[ Upstream commit af8a85a417 ]

Calling ceph_buffer_put() in fill_inode() may result in freeing the
i_xattrs.blob buffer while holding the i_ceph_lock.  This can be fixed by
postponing the call until later, when the lock is released.

The following backtrace was triggered by fstests generic/070.

  BUG: sleeping function called from invalid context at mm/vmalloc.c:2283
  in_atomic(): 1, irqs_disabled(): 0, pid: 3852, name: kworker/0:4
  6 locks held by kworker/0:4/3852:
   #0: 000000004270f6bb ((wq_completion)ceph-msgr){+.+.}, at: process_one_work+0x1b8/0x5f0
   #1: 00000000eb420803 ((work_completion)(&(&con->work)->work)){+.+.}, at: process_one_work+0x1b8/0x5f0
   #2: 00000000be1c53a4 (&s->s_mutex){+.+.}, at: dispatch+0x288/0x1476
   #3: 00000000559cb958 (&mdsc->snap_rwsem){++++}, at: dispatch+0x2eb/0x1476
   #4: 000000000d5ebbae (&req->r_fill_mutex){+.+.}, at: dispatch+0x2fc/0x1476
   #5: 00000000a83d0514 (&(&ci->i_ceph_lock)->rlock){+.+.}, at: fill_inode.isra.0+0xf8/0xf70
  CPU: 0 PID: 3852 Comm: kworker/0:4 Not tainted 5.2.0+ #441
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-prebuilt.qemu.org 04/01/2014
  Workqueue: ceph-msgr ceph_con_workfn
  Call Trace:
   dump_stack+0x67/0x90
   ___might_sleep.cold+0x9f/0xb1
   vfree+0x4b/0x60
   ceph_buffer_release+0x1b/0x60
   fill_inode.isra.0+0xa9b/0xf70
   ceph_fill_trace+0x13b/0xc70
   ? dispatch+0x2eb/0x1476
   dispatch+0x320/0x1476
   ? __mutex_unlock_slowpath+0x4d/0x2a0
   ceph_con_workfn+0xc97/0x2ec0
   ? process_one_work+0x1b8/0x5f0
   process_one_work+0x244/0x5f0
   worker_thread+0x4d/0x3e0
   kthread+0x105/0x140
   ? process_one_work+0x5f0/0x5f0
   ? kthread_park+0x90/0x90
   ret_from_fork+0x3a/0x50

Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-10 10:33:52 +01:00
Luis Henriques
5cd1e3552f ceph: fix buffer free while holding i_ceph_lock in __ceph_build_xattrs_blob()
[ Upstream commit 12fe3dda7e ]

Calling ceph_buffer_put() in __ceph_build_xattrs_blob() may result in
freeing the i_xattrs.blob buffer while holding the i_ceph_lock.  This can
be fixed by having this function returning the old blob buffer and have
the callers of this function freeing it when the lock is released.

The following backtrace was triggered by fstests generic/117.

  BUG: sleeping function called from invalid context at mm/vmalloc.c:2283
  in_atomic(): 1, irqs_disabled(): 0, pid: 649, name: fsstress
  4 locks held by fsstress/649:
   #0: 00000000a7478e7e (&type->s_umount_key#19){++++}, at: iterate_supers+0x77/0xf0
   #1: 00000000f8de1423 (&(&ci->i_ceph_lock)->rlock){+.+.}, at: ceph_check_caps+0x7b/0xc60
   #2: 00000000562f2b27 (&s->s_mutex){+.+.}, at: ceph_check_caps+0x3bd/0xc60
   #3: 00000000f83ce16a (&mdsc->snap_rwsem){++++}, at: ceph_check_caps+0x3ed/0xc60
  CPU: 1 PID: 649 Comm: fsstress Not tainted 5.2.0+ #439
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-prebuilt.qemu.org 04/01/2014
  Call Trace:
   dump_stack+0x67/0x90
   ___might_sleep.cold+0x9f/0xb1
   vfree+0x4b/0x60
   ceph_buffer_release+0x1b/0x60
   __ceph_build_xattrs_blob+0x12b/0x170
   __send_cap+0x302/0x540
   ? __lock_acquire+0x23c/0x1e40
   ? __mark_caps_flushing+0x15c/0x280
   ? _raw_spin_unlock+0x24/0x30
   ceph_check_caps+0x5f0/0xc60
   ceph_flush_dirty_caps+0x7c/0x150
   ? __ia32_sys_fdatasync+0x20/0x20
   ceph_sync_fs+0x5a/0x130
   iterate_supers+0x8f/0xf0
   ksys_sync+0x4f/0xb0
   __ia32_sys_sync+0xa/0x10
   do_syscall_64+0x50/0x1c0
   entry_SYSCALL_64_after_hwframe+0x49/0xbe
  RIP: 0033:0x7fc6409ab617

Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-10 10:33:52 +01:00
Luis Henriques
dfb8712c7a ceph: fix buffer free while holding i_ceph_lock in __ceph_setxattr()
[ Upstream commit 86968ef215 ]

Calling ceph_buffer_put() in __ceph_setxattr() may end up freeing the
i_xattrs.prealloc_blob buffer while holding the i_ceph_lock.  This can be
fixed by postponing the call until later, when the lock is released.

The following backtrace was triggered by fstests generic/117.

  BUG: sleeping function called from invalid context at mm/vmalloc.c:2283
  in_atomic(): 1, irqs_disabled(): 0, pid: 650, name: fsstress
  3 locks held by fsstress/650:
   #0: 00000000870a0fe8 (sb_writers#8){.+.+}, at: mnt_want_write+0x20/0x50
   #1: 00000000ba0c4c74 (&type->i_mutex_dir_key#6){++++}, at: vfs_setxattr+0x55/0xa0
   #2: 000000008dfbb3f2 (&(&ci->i_ceph_lock)->rlock){+.+.}, at: __ceph_setxattr+0x297/0x810
  CPU: 1 PID: 650 Comm: fsstress Not tainted 5.2.0+ #437
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-prebuilt.qemu.org 04/01/2014
  Call Trace:
   dump_stack+0x67/0x90
   ___might_sleep.cold+0x9f/0xb1
   vfree+0x4b/0x60
   ceph_buffer_release+0x1b/0x60
   __ceph_setxattr+0x2b4/0x810
   __vfs_setxattr+0x66/0x80
   __vfs_setxattr_noperm+0x59/0xf0
   vfs_setxattr+0x81/0xa0
   setxattr+0x115/0x230
   ? filename_lookup+0xc9/0x140
   ? rcu_read_lock_sched_held+0x74/0x80
   ? rcu_sync_lockdep_assert+0x2e/0x60
   ? __sb_start_write+0x142/0x1a0
   ? mnt_want_write+0x20/0x50
   path_setxattr+0xba/0xd0
   __x64_sys_lsetxattr+0x24/0x30
   do_syscall_64+0x50/0x1c0
   entry_SYSCALL_64_after_hwframe+0x49/0xbe
  RIP: 0033:0x7ff23514359a

Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-10 10:33:51 +01:00
Darrick J. Wong
ac3cc25f38 vfs: fix page locking deadlocks when deduping files
[ Upstream commit edc58dd012 ]

When dedupe wants to use the page cache to compare parts of two files
for dedupe, we must be very careful to handle locking correctly.  The
current code doesn't do this.  It must lock and unlock the page only
once if the two pages are the same, since the overlapping range check
doesn't catch this when blocksize < pagesize.  If the pages are distinct
but from the same file, we must observe page locking order and lock them
in order of increasing offset to avoid clashing with writeback locking.

Fixes: 876bec6f9b ("vfs: refactor clone/dedupe_file_range common functions")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Bill O'Donnell <billodo@redhat.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-10 10:33:47 +01:00
Greg Kroah-Hartman
6b1f307bd0 This is the 4.19.70 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl1yF0AACgkQONu9yGCS
 aT64ew/6AzJDRMcmnx1COeRP8tfQ5A8ghjnp6REEca1MYJWjDlqd0X+EMd/7zorZ
 YkBzb1ND1c/9KeGQzdx8lJ5lcNRVJcD7tT6irT5zMnBcyR9uPamAaVgmVHXxuorK
 el4nvX9g/MxBLsuqLHxGr5pNXi7mHu6zfXyQ86TJzlez7yHlPuiJb8bDpHnCRJ1P
 n/MemPq/nQgC5jPBQRhT+IpqC1MTIHNhRaHHg/5Gdrrz+eVumnk+1zbWqtBRuJKS
 qS7RL1pI0Su00i5bY1r76iSkoRkGw9SoeIgz3sycbtAvGo9TI16/hZPvWAsYOAjL
 2DQS0rMPSnM4QV0odbUImFt86f0YJiAL7xYS8EYCc4GX/eLbNRtP8yLq+8rlo4Oa
 36HbiNhGSEjRxenfVRUD/STgBYzfVeQOMEyFJNRtfNDP/l66sLk/pEOEd83j06H4
 G87BJgKFC35dv5QrbCmJO8P1IXLs5QaChD3dL6R9/hbvCU2A/MOqhPL16JCXWA20
 +hOWn8ryrtBa5Dt0avAkrrnUNC8cVWyD44uAm+Hu/49CZpkTasZMB7Z81VSIsuvO
 xoT1Jx0J1W/LtHwCghSkui/fjQjVpkP3xnB7zVGem73Mpcm68g6mLajKaUrLnJHp
 /sz2mppF7gZob43anTtUhSV9OvrzqftkR+iFg3rCKSJyQE1o48o=
 =hMvg
 -----END PGP SIGNATURE-----

Merge 4.19.70 into android-4.19

Changes in 4.19.70
	dmaengine: ste_dma40: fix unneeded variable warning
	nvme-multipath: revalidate nvme_ns_head gendisk in nvme_validate_ns
	afs: Fix the CB.ProbeUuid service handler to reply correctly
	afs: Fix loop index mixup in afs_deliver_vl_get_entry_by_name_u()
	fs: afs: Fix a possible null-pointer dereference in afs_put_read()
	afs: Only update d_fsdata if different in afs_d_revalidate()
	nvmet-loop: Flush nvme_delete_wq when removing the port
	nvme: fix a possible deadlock when passthru commands sent to a multipath device
	nvme-pci: Fix async probe remove race
	soundwire: cadence_master: fix register definition for SLAVE_STATE
	soundwire: cadence_master: fix definitions for INTSTAT0/1
	auxdisplay: panel: need to delete scan_timer when misc_register fails in panel_attach
	dmaengine: stm32-mdma: Fix a possible null-pointer dereference in stm32_mdma_irq_handler()
	omap-dma/omap_vout_vrfb: fix off-by-one fi value
	iommu/dma: Handle SG length overflow better
	usb: gadget: composite: Clear "suspended" on reset/disconnect
	usb: gadget: mass_storage: Fix races between fsg_disable and fsg_set_alt
	xen/blkback: fix memory leaks
	arm64: cpufeature: Don't treat granule sizes as strict
	i2c: rcar: avoid race when unregistering slave client
	i2c: emev2: avoid race when unregistering slave client
	drm/ast: Fixed reboot test may cause system hanged
	usb: host: fotg2: restart hcd after port reset
	tools: hv: fixed Python pep8/flake8 warnings for lsvmbus
	tools: hv: fix KVP and VSS daemons exit code
	drm/i915: fix broadwell EU computation
	watchdog: bcm2835_wdt: Fix module autoload
	drm/bridge: tfp410: fix memleak in get_modes()
	scsi: ufs: Fix RX_TERMINATION_FORCE_ENABLE define value
	drm/tilcdc: Register cpufreq notifier after we have initialized crtc
	net/tls: Fixed return value when tls_complete_pending_work() fails
	net/tls: swap sk_write_space on close
	net: tls, fix sk_write_space NULL write when tx disabled
	ipv6/addrconf: allow adding multicast addr if IFA_F_MCAUTOJOIN is set
	ipv6: Default fib6_type to RTN_UNICAST when not set
	net/smc: make sure EPOLLOUT is raised
	tcp: make sure EPOLLOUT wont be missed
	ipv4/icmp: fix rt dst dev null pointer dereference
	mm/zsmalloc.c: fix build when CONFIG_COMPACTION=n
	ALSA: usb-audio: Check mixer unit bitmap yet more strictly
	ALSA: line6: Fix memory leak at line6_init_pcm() error path
	ALSA: hda - Fixes inverted Conexant GPIO mic mute led
	ALSA: seq: Fix potential concurrent access to the deleted pool
	ALSA: usb-audio: Fix invalid NULL check in snd_emuusb_set_samplerate()
	ALSA: usb-audio: Add implicit fb quirk for Behringer UFX1604
	kvm: x86: skip populating logical dest map if apic is not sw enabled
	KVM: x86: Don't update RIP or do single-step on faulting emulation
	uprobes/x86: Fix detection of 32-bit user mode
	x86/apic: Do not initialize LDR and DFR for bigsmp
	x86/apic: Include the LDR when clearing out APIC registers
	ftrace: Fix NULL pointer dereference in t_probe_next()
	ftrace: Check for successful allocation of hash
	ftrace: Check for empty hash and comment the race with registering probes
	usb-storage: Add new JMS567 revision to unusual_devs
	USB: cdc-wdm: fix race between write and disconnect due to flag abuse
	usb: hcd: use managed device resources
	usb: chipidea: udc: don't do hardware access if gadget has stopped
	usb: host: ohci: fix a race condition between shutdown and irq
	usb: host: xhci: rcar: Fix typo in compatible string matching
	USB: storage: ums-realtek: Update module parameter description for auto_delink_en
	USB: storage: ums-realtek: Whitelist auto-delink support
	mei: me: add Tiger Lake point LP device ID
	mmc: sdhci-of-at91: add quirk for broken HS200
	mmc: core: Fix init of SD cards reporting an invalid VDD range
	stm class: Fix a double free of stm_source_device
	intel_th: pci: Add support for another Lewisburg PCH
	intel_th: pci: Add Tiger Lake support
	typec: tcpm: fix a typo in the comparison of pdo_max_voltage
	fsi: scom: Don't abort operations for minor errors
	lib: logic_pio: Fix RCU usage
	lib: logic_pio: Avoid possible overlap for unregistering regions
	lib: logic_pio: Add logic_pio_unregister_range()
	drm/amdgpu: Add APTX quirk for Dell Latitude 5495
	drm/i915: Don't deballoon unused ggtt drm_mm_node in linux guest
	drm/i915: Call dma_set_max_seg_size() in i915_driver_hw_probe()
	bus: hisi_lpc: Unregister logical PIO range to avoid potential use-after-free
	bus: hisi_lpc: Add .remove method to avoid driver unbind crash
	VMCI: Release resource if the work is already queued
	crypto: ccp - Ignore unconfigured CCP device on suspend/resume
	Revert "cfg80211: fix processing world regdomain when non modular"
	mac80211: fix possible sta leak
	mac80211: Don't memset RXCB prior to PAE intercept
	mac80211: Correctly set noencrypt for PAE frames
	KVM: PPC: Book3S: Fix incorrect guest-to-user-translation error handling
	KVM: arm/arm64: vgic: Fix potential deadlock when ap_list is long
	KVM: arm/arm64: vgic-v2: Handle SGI bits in GICD_I{S,C}PENDR0 as WI
	NFS: Clean up list moves of struct nfs_page
	NFSv4/pnfs: Fix a page lock leak in nfs_pageio_resend()
	NFS: Pass error information to the pgio error cleanup routine
	NFS: Ensure O_DIRECT reports an error if the bytes read/written is 0
	i2c: piix4: Fix port selection for AMD Family 16h Model 30h
	x86/ptrace: fix up botched merge of spectrev1 fix
	mt76: mt76x0u: do not reset radio on resume
	Revert "ASoC: Fail card instantiation if DAI format setup fails"
	Linux 4.19.70

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I35ff8a403a05a8c66d87cb4b542997e63c422288
2019-09-06 12:08:42 +02:00
Trond Myklebust
4f4be79c9e NFS: Ensure O_DIRECT reports an error if the bytes read/written is 0
[ Upstream commit eb2c50da9e ]

If the attempt to resend the I/O results in no bytes being read/written,
we must ensure that we report the error.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Fixes: 0a00b77b33 ("nfs: mirroring support for direct io")
Cc: stable@vger.kernel.org # v3.20+
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-06 10:22:23 +02:00
Trond Myklebust
b5891b624b NFS: Pass error information to the pgio error cleanup routine
[ Upstream commit df3accb849 ]

Allow the caller to pass error information when cleaning up a failed
I/O request so that we can conditionally take action to cancel the
request altogether if the error turned out to be fatal.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-06 10:22:23 +02:00
Trond Myklebust
812de6dee5 NFSv4/pnfs: Fix a page lock leak in nfs_pageio_resend()
[ Upstream commit f4340e9314 ]

If the attempt to resend the pages fails, we need to ensure that we
clean up those pages that were not transmitted.

Fixes: d600ad1f2b ("NFS41: pop some layoutget errors to application")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v4.5+
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-06 10:22:22 +02:00
Trond Myklebust
57c491fd84 NFS: Clean up list moves of struct nfs_page
[ Upstream commit 078b5fd92c ]

In several places we're just moving the struct nfs_page from one list to
another by first removing from the existing list, then adding to the new
one.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-06 10:22:22 +02:00
David Howells
9c55dc85d8 afs: Only update d_fsdata if different in afs_d_revalidate()
[ Upstream commit 5dc84855b0 ]

In the in-kernel afs filesystem, d_fsdata is set with the data version of
the parent directory.  afs_d_revalidate() will update this to the current
directory version, but it shouldn't do this if it the value it read from
d_fsdata is the same as no lock is held and cmpxchg() is not used.

Fix the code to only change the value if it is different from the current
directory version.

Fixes: 260a980317 ("[AFS]: Add "directory write" support.")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-06 10:21:55 +02:00
Jia-Ju Bai
24e093b969 fs: afs: Fix a possible null-pointer dereference in afs_put_read()
[ Upstream commit a6eed4ab5d ]

In afs_read_dir(), there is an if statement on line 255 to check whether
req->pages is NULL:
	if (!req->pages)
		goto error;

If req->pages is NULL, afs_put_read() on line 337 is executed.
In afs_put_read(), req->pages[i] is used on line 195.
Thus, a possible null-pointer dereference may occur in this case.

To fix this possible bug, an if statement is added in afs_put_read() to
check req->pages.

This bug is found by a static analysis tool STCheck written by us.

Fixes: f3ddee8dc4 ("afs: Fix directory handling")
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-06 10:21:54 +02:00
Marc Dionne
8e5179f982 afs: Fix loop index mixup in afs_deliver_vl_get_entry_by_name_u()
[ Upstream commit 4a46fdba44 ]

afs_deliver_vl_get_entry_by_name_u() scans through the vl entry
received from the volume location server and builds a return list
containing the sites that are currently valid.  When assigning
values for the return list, the index into the vl entry (i) is used
rather than the one for the new list (entry->nr_server).  If all
sites are usable, this works out fine as the indices will match.
If some sites are not valid, for example if AFS_VLSF_DONTUSE is
set, fs_mask and the uuid will be set for the wrong return site.

Fix this by using entry->nr_server as the index into the arrays
being filled in rather than i.

This can lead to EDESTADDRREQ errors if none of the returned sites
have a valid fs_mask.

Fixes: d2ddc776a4 ("afs: Overhaul volume and server record caching and fileserver rotation")
Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-06 10:21:54 +02:00
David Howells
dfc438c0bc afs: Fix the CB.ProbeUuid service handler to reply correctly
[ Upstream commit 2067b2b3f4 ]

Fix the service handler function for the CB.ProbeUuid RPC call so that it
replies in the correct manner - that is an empty reply for success and an
abort of 1 for failure.

Putting 0 or 1 in an integer in the body of the reply should result in the
fileserver throwing an RX_PROTOCOL_ERROR abort and discarding its record of
the client; older servers, however, don't necessarily check that all the
data got consumed, and so might incorrectly think that they got a positive
response and associate the client with the wrong host record.

If the client is incorrectly associated, this will result in callbacks
intended for a different client being delivered to this one and then, when
the other client connects and responds positively, all of the callback
promises meant for the client that issued the improper response will be
lost and it won't receive any further change notifications.

Fixes: 9396d496d7 ("afs: support the CB.ProbeUuid RPC op")
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-06 10:21:54 +02:00
Greg Kroah-Hartman
12dc90c620 This is the 4.19.69 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl1ncKwACgkQONu9yGCS
 aT6FPg//RiiJo8O+CUzkP4MFohy8JUuGC1MnnSfSFJn9bzAljWhYtSoJlZ9PbfHq
 qx9oWuQNNVZRn9nWuRbTRfRlz6ztc7whsjhAth4eNCtXvu+xAvFLvFhlbVt6xiZ0
 Wg3jXDtIY3Y8km01uJdzVk/juUqvTU8nioM4s1OWTFRfOfakLMK9CkxOKfZMFnxP
 mVILTcOxZAf0Js2tRMRPvm8c6OhegkXZjWUhGMlvmFKk/pqUouVXH8pKbBoTj8zR
 VoHB6pWs3YG3S15WgkNfKiR9WpeXywC7XN9ilziczaQ8HbsH6Y+5wM9Ncx+3FVxd
 mzygLCtlckYWjiabS/w3tQHrH+LV8MaPYuW/2tlL9sBljlDW5RBW+g7vaBQjIpCK
 gco/z0qmeEIYt8ktLL08i9FQBPp00Fra9x3jZKLz8Tp+W//EBm4ENMm2cxHtRKtd
 3fG70ngJmycksCK/e8N466/1f/aAOxfBZkog3R/4yqNvOX8rkQGRJlhv0AzIdsy8
 RlTDotDwwQFbMWROVs/Jea+9Wwp71jlPOXyMqX2EqUR/WfDWuIZEyzSdbqKPNgHL
 Q9OoUB7kIKGusqQC6ABYKtDn7T046o6ePEB8r2aIvPmw5AiWMefSubHNV6mP3KJb
 0NbQlUelMTvxuwm5NLMY2EWbWi+jvg4OgJvajwcDretTPeBGMZI=
 =945Y
 -----END PGP SIGNATURE-----

Merge 4.19.69 into android-4.19

Changes in 4.19.69
	HID: Add 044f:b320 ThrustMaster, Inc. 2 in 1 DT
	MIPS: kernel: only use i8253 clocksource with periodic clockevent
	mips: fix cacheinfo
	netfilter: ebtables: fix a memory leak bug in compat
	ASoC: dapm: Fix handling of custom_stop_condition on DAPM graph walks
	selftests/bpf: fix sendmsg6_prog on s390
	bonding: Force slave speed check after link state recovery for 802.3ad
	net: mvpp2: Don't check for 3 consecutive Idle frames for 10G links
	selftests: forwarding: gre_multipath: Enable IPv4 forwarding
	selftests: forwarding: gre_multipath: Fix flower filters
	can: dev: call netif_carrier_off() in register_candev()
	can: mcp251x: add error check when wq alloc failed
	can: gw: Fix error path of cgw_module_init
	ASoC: Fail card instantiation if DAI format setup fails
	st21nfca_connectivity_event_received: null check the allocation
	st_nci_hci_connectivity_event_received: null check the allocation
	ASoC: rockchip: Fix mono capture
	ASoC: ti: davinci-mcasp: Correct slot_width posed constraint
	net: usb: qmi_wwan: Add the BroadMobi BM818 card
	qed: RDMA - Fix the hw_ver returned in device attributes
	isdn: mISDN: hfcsusb: Fix possible null-pointer dereferences in start_isoc_chain()
	mac80211_hwsim: Fix possible null-pointer dereferences in hwsim_dump_radio_nl()
	netfilter: ipset: Actually allow destination MAC address for hash:ip,mac sets too
	netfilter: ipset: Copy the right MAC address in bitmap:ip,mac and hash:ip,mac sets
	netfilter: ipset: Fix rename concurrency with listing
	rxrpc: Fix potential deadlock
	rxrpc: Fix the lack of notification when sendmsg() fails on a DATA packet
	isdn: hfcsusb: Fix mISDN driver crash caused by transfer buffer on the stack
	net: phy: phy_led_triggers: Fix a possible null-pointer dereference in phy_led_trigger_change_speed()
	perf bench numa: Fix cpu0 binding
	can: sja1000: force the string buffer NULL-terminated
	can: peak_usb: force the string buffer NULL-terminated
	net/ethernet/qlogic/qed: force the string buffer NULL-terminated
	NFSv4: Fix a potential sleep while atomic in nfs4_do_reclaim()
	NFS: Fix regression whereby fscache errors are appearing on 'nofsc' mounts
	HID: quirks: Set the INCREMENT_USAGE_ON_DUPLICATE quirk on Saitek X52
	HID: input: fix a4tech horizontal wheel custom usage
	drm/rockchip: Suspend DP late
	SMB3: Fix potential memory leak when processing compound chain
	SMB3: Kernel oops mounting a encryptData share with CONFIG_DEBUG_VIRTUAL
	s390: put _stext and _etext into .text section
	net: cxgb3_main: Fix a resource leak in a error path in 'init_one()'
	net: stmmac: Fix issues when number of Queues >= 4
	net: stmmac: tc: Do not return a fragment entry
	net: hisilicon: make hip04_tx_reclaim non-reentrant
	net: hisilicon: fix hip04-xmit never return TX_BUSY
	net: hisilicon: Fix dma_map_single failed on arm64
	libata: have ata_scsi_rw_xlat() fail invalid passthrough requests
	libata: add SG safety checks in SFF pio transfers
	x86/lib/cpu: Address missing prototypes warning
	drm/vmwgfx: fix memory leak when too many retries have occurred
	block, bfq: handle NULL return value by bfq_init_rq()
	perf ftrace: Fix failure to set cpumask when only one cpu is present
	perf cpumap: Fix writing to illegal memory in handling cpumap mask
	perf pmu-events: Fix missing "cpu_clk_unhalted.core" event
	KVM: arm64: Don't write junk to sysregs on reset
	KVM: arm: Don't write junk to CP15 registers on reset
	selftests: kvm: Adding config fragments
	HID: wacom: correct misreported EKR ring values
	HID: wacom: Correct distance scale for 2nd-gen Intuos devices
	Revert "dm bufio: fix deadlock with loop device"
	clk: socfpga: stratix10: fix rate caclulationg for cnt_clks
	ceph: clear page dirty before invalidate page
	ceph: don't try fill file_lock on unsuccessful GETFILELOCK reply
	libceph: fix PG split vs OSD (re)connect race
	drm/nouveau: Don't retry infinitely when receiving no data on i2c over AUX
	gpiolib: never report open-drain/source lines as 'input' to user-space
	Drivers: hv: vmbus: Fix virt_to_hvpfn() for X86_PAE
	userfaultfd_release: always remove uffd flags and clear vm_userfaultfd_ctx
	x86/retpoline: Don't clobber RFLAGS during CALL_NOSPEC on i386
	x86/apic: Handle missing global clockevent gracefully
	x86/CPU/AMD: Clear RDRAND CPUID bit on AMD family 15h/16h
	x86/boot: Save fields explicitly, zero out everything else
	x86/boot: Fix boot regression caused by bootparam sanitizing
	dm kcopyd: always complete failed jobs
	dm btree: fix order of block initialization in btree_split_beneath
	dm integrity: fix a crash due to BUG_ON in __journal_read_write()
	dm raid: add missing cleanup in raid_ctr()
	dm space map metadata: fix missing store of apply_bops() return value
	dm table: fix invalid memory accesses with too high sector number
	dm zoned: improve error handling in reclaim
	dm zoned: improve error handling in i/o map code
	dm zoned: properly handle backing device failure
	genirq: Properly pair kobject_del() with kobject_add()
	mm, page_owner: handle THP splits correctly
	mm/zsmalloc.c: migration can leave pages in ZS_EMPTY indefinitely
	mm/zsmalloc.c: fix race condition in zs_destroy_pool
	xfs: fix missing ILOCK unlock when xfs_setattr_nonsize fails due to EDQUOT
	xfs: don't trip over uninitialized buffer on extent read of corrupted inode
	xfs: Move fs/xfs/xfs_attr.h to fs/xfs/libxfs/xfs_attr.h
	xfs: Add helper function xfs_attr_try_sf_addname
	xfs: Add attibute set and helper functions
	xfs: Add attibute remove and helper functions
	xfs: always rejoin held resources during defer roll
	dm zoned: fix potential NULL dereference in dmz_do_reclaim()
	powerpc: Allow flush_(inval_)dcache_range to work across ranges >4GB
	rxrpc: Fix local endpoint refcounting
	rxrpc: Fix read-after-free in rxrpc_queue_local()
	rxrpc: Fix local endpoint replacement
	rxrpc: Fix local refcounting
	Linux 4.19.69

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I9824a29e0434a6a80e2f32fdb88c0ac1fe8e5af5
2019-09-02 17:39:29 +02:00
Darrick J. Wong
655bb2c4ac xfs: always rejoin held resources during defer roll
commit 710d707d2f upstream.

During testing of xfs/141 on a V4 filesystem, I observed some
inconsistent behavior with regards to resources that are held (i.e.
remain locked) across a defer roll.  The transaction roll always gives
the defer roll function a new transaction, even if committing the old
transaction fails.  However, the defer roll function only rejoins the
held resources if the transaction commit succeedied.  This means that
callers of defer roll have to figure out whether the held resources are
attached to the transaction being passed back.

Worse yet, if the defer roll was part of a defer finish call, we have a
third possibility: the defer finish could pass back a dirty transaction
with dirty held resources and an error code.

The only sane way to handle all of these scenarios is to require that
the code that held the resource either cancel the transaction before
unlocking and releasing the resources, or use functions that detach
resources from a transaction properly (e.g.  xfs_trans_brelse) if they
need to drop the reference before committing or cancelling the
transaction.

In order to make this so, change the defer roll code to join held
resources to the new transaction unconditionally and fix all the bhold
callers to release the held buffers correctly.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
[mcgrof: fixes kz#204223 ]
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-08-29 08:28:58 +02:00
Allison Henderson
83a8e6b2f2 xfs: Add attibute remove and helper functions
commit 068f985a9e upstream.

This patch adds xfs_attr_remove_args. These sub-routines remove
the attributes specified in @args. We will use this later for setting
parent pointers as a deferred attribute operation.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-08-29 08:28:58 +02:00
Allison Henderson
b21ff6cfcc xfs: Add attibute set and helper functions
commit 2f3cd80919 upstream.

This patch adds xfs_attr_set_args and xfs_bmap_set_attrforkoff.
These sub-routines set the attributes specified in @args.
We will use this later for setting parent pointers as a deferred
attribute operation.

[dgc: remove attr fork init code from xfs_attr_set_args().]
[dgc: xfs_attr_try_sf_addname() NULLs args.trans after commit.]
[dgc: correct sf add error handling.]

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-08-29 08:28:58 +02:00
Allison Henderson
b3a248f230 xfs: Add helper function xfs_attr_try_sf_addname
commit 4c74a56b9d upstream.

This patch adds a subroutine xfs_attr_try_sf_addname
used by xfs_attr_set.  This subrotine will attempt to
add the attribute name specified in args in shortform,
as well and perform error handling previously done in
xfs_attr_set.

This patch helps to pre-simplify xfs_attr_set for reviewing
purposes and reduce indentation.  New function will be added
in the next patch.

[dgc: moved commit to helper function, too.]

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-08-29 08:28:58 +02:00
Allison Henderson
a9912f346b xfs: Move fs/xfs/xfs_attr.h to fs/xfs/libxfs/xfs_attr.h
commit e2421f0b5f upstream.

This patch moves fs/xfs/xfs_attr.h to fs/xfs/libxfs/xfs_attr.h
since xfs_attr.c is in libxfs.  We will need these later in
xfsprogs.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-08-29 08:28:58 +02:00
Brian Foster
17c2b7af71 xfs: don't trip over uninitialized buffer on extent read of corrupted inode
commit 6958d11f77 upstream.

We've had rather rare reports of bmap btree block corruption where
the bmap root block has a level count of zero. The root cause of the
corruption is so far unknown. We do have verifier checks to detect
this form of on-disk corruption, but this doesn't cover a memory
corruption variant of the problem. The latter is a reasonable
possibility because the root block is part of the inode fork and can
reside in-core for some time before inode extents are read.

If this occurs, it leads to a system crash such as the following:

 BUG: unable to handle kernel paging request at ffffffff00000221
 PF error: [normal kernel read fault]
 ...
 RIP: 0010:xfs_trans_brelse+0xf/0x200 [xfs]
 ...
 Call Trace:
  xfs_iread_extents+0x379/0x540 [xfs]
  xfs_file_iomap_begin_delay+0x11a/0xb40 [xfs]
  ? xfs_attr_get+0xd1/0x120 [xfs]
  ? iomap_write_begin.constprop.40+0x2d0/0x2d0
  xfs_file_iomap_begin+0x4c4/0x6d0 [xfs]
  ? __vfs_getxattr+0x53/0x70
  ? iomap_write_begin.constprop.40+0x2d0/0x2d0
  iomap_apply+0x63/0x130
  ? iomap_write_begin.constprop.40+0x2d0/0x2d0
  iomap_file_buffered_write+0x62/0x90
  ? iomap_write_begin.constprop.40+0x2d0/0x2d0
  xfs_file_buffered_aio_write+0xe4/0x3b0 [xfs]
  __vfs_write+0x150/0x1b0
  vfs_write+0xba/0x1c0
  ksys_pwrite64+0x64/0xa0
  do_syscall_64+0x5a/0x1d0
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

The crash occurs because xfs_iread_extents() attempts to release an
uninitialized buffer pointer as the level == 0 value prevented the
buffer from ever being allocated or read. Change the level > 0
assert to an explicit error check in xfs_iread_extents() to avoid
crashing the kernel in the event of localized, in-core inode
corruption.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-08-29 08:28:57 +02:00
Darrick J. Wong
11f85d4d77 xfs: fix missing ILOCK unlock when xfs_setattr_nonsize fails due to EDQUOT
commit 1fb254aa98 upstream.

Benjamin Moody reported to Debian that XFS partially wedges when a chgrp
fails on account of being out of disk quota.  I ran his reproducer
script:

# adduser dummy
# adduser dummy plugdev

# dd if=/dev/zero bs=1M count=100 of=test.img
# mkfs.xfs test.img
# mount -t xfs -o gquota test.img /mnt
# mkdir -p /mnt/dummy
# chown -c dummy /mnt/dummy
# xfs_quota -xc 'limit -g bsoft=100k bhard=100k plugdev' /mnt

(and then as user dummy)

$ dd if=/dev/urandom bs=1M count=50 of=/mnt/dummy/foo
$ chgrp plugdev /mnt/dummy/foo

and saw:

================================================
WARNING: lock held when returning to user space!
5.3.0-rc5 #rc5 Tainted: G        W
------------------------------------------------
chgrp/47006 is leaving the kernel with locks still held!
1 lock held by chgrp/47006:
 #0: 000000006664ea2d (&xfs_nondir_ilock_class){++++}, at: xfs_ilock+0xd2/0x290 [xfs]

...which is clearly caused by xfs_setattr_nonsize failing to unlock the
ILOCK after the xfs_qm_vop_chown_reserve call fails.  Add the missing
unlock.

Reported-by: benjamin.moody@gmail.com
Fixes: 253f4911f2 ("xfs: better xfs_trans_alloc interface")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Tested-by: Salvatore Bonaccorso <carnil@debian.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-29 08:28:57 +02:00
Oleg Nesterov
cf13e30c58 userfaultfd_release: always remove uffd flags and clear vm_userfaultfd_ctx
commit 46d0b24c5e upstream.

userfaultfd_release() should clear vm_flags/vm_userfaultfd_ctx even if
mm->core_state != NULL.

Otherwise a page fault can see userfaultfd_missing() == T and use an
already freed userfaultfd_ctx.

Link: http://lkml.kernel.org/r/20190820160237.GB4983@redhat.com
Fixes: 04f5866e41 ("coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping")
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Jann Horn <jannh@google.com>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-29 08:28:52 +02:00