linux/drivers/misc
Nadav Amit 4e77b2ea94 VMCI: Release resource if the work is already queued
commit ba03a9bbd1 upstream.

Francois reported that the VMware balloon gets stuck after a balloon
reset, when the VMCI doorbell is removed. A similar hang can occur when
the balloon driver is removed, producing the following splat:

[ 1088.622000] INFO: task modprobe:3565 blocked for more than 120 seconds.
[ 1088.622035]       Tainted: G        W         5.2.0 #4
[ 1088.622087] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1088.622205] modprobe        D    0  3565   1450 0x00000000
[ 1088.622210] Call Trace:
[ 1088.622246]  __schedule+0x2a8/0x690
[ 1088.622248]  schedule+0x2d/0x90
[ 1088.622250]  schedule_timeout+0x1d3/0x2f0
[ 1088.622252]  wait_for_completion+0xba/0x140
[ 1088.622320]  ? wake_up_q+0x80/0x80
[ 1088.622370]  vmci_resource_remove+0xb9/0xc0 [vmw_vmci]
[ 1088.622373]  vmci_doorbell_destroy+0x9e/0xd0 [vmw_vmci]
[ 1088.622379]  vmballoon_vmci_cleanup+0x6e/0xf0 [vmw_balloon]
[ 1088.622381]  vmballoon_exit+0x18/0xcc8 [vmw_balloon]
[ 1088.622394]  __x64_sys_delete_module+0x146/0x280
[ 1088.622408]  do_syscall_64+0x5a/0x130
[ 1088.622410]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1088.622415] RIP: 0033:0x7f54f62791b7
[ 1088.622421] Code: Bad RIP value.
[ 1088.622421] RSP: 002b:00007fff2a949008 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[ 1088.622426] RAX: ffffffffffffffda RBX: 000055dff8b55d00 RCX: 00007f54f62791b7
[ 1088.622426] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 000055dff8b55d68
[ 1088.622427] RBP: 000055dff8b55d00 R08: 00007fff2a947fb1 R09: 0000000000000000
[ 1088.622427] R10: 00007f54f62f5cc0 R11: 0000000000000206 R12: 000055dff8b55d68
[ 1088.622428] R13: 0000000000000001 R14: 000055dff8b55d68 R15: 00007fff2a94a3f0

The cause of the bug is that when the "delayed" doorbell is invoked, it
takes a reference on the doorbell entry and schedules work that is
supposed to run the appropriate code and drop the doorbell entry
reference. The code ignores the fact that if the work is already queued,
it will not be scheduled to run one more time. As a result, one of the
references is never dropped. When the code waits for the reference count
to reach zero, during balloon reset or module removal, it gets stuck.

Fix it by dropping the reference if schedule_work() returns false, which
indicates that the work is already queued.
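
The reference imbalance can be illustrated with a small userspace model.
This is only a sketch under stated assumptions: the struct and every
model_* name below are hypothetical stand-ins, not the vmw_vmci code.
The only kernel behaviour it relies on is that schedule_work() returns
false when the work item is already queued.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a doorbell entry; not the kernel struct. */
struct model_entry {
	int refcount;      /* 1 = the owner's base reference */
	bool work_pending; /* true while the work item sits on the queue */
};

void model_get(struct model_entry *e)
{
	e->refcount++;
}

void model_put(struct model_entry *e)
{
	assert(e->refcount > 0);
	e->refcount--;
}

/* Mirrors the schedule_work() contract: false if the work was already queued. */
bool model_schedule_work(struct model_entry *e)
{
	if (e->work_pending)
		return false;
	e->work_pending = true;
	return true;
}

/* The queued work runs once and drops exactly one reference. */
void model_run_work(struct model_entry *e)
{
	if (!e->work_pending)
		return;
	e->work_pending = false;
	model_put(e);
}

/* Doorbell notification: take a reference, then try to queue the work. */
void model_fire(struct model_entry *e, bool fixed)
{
	model_get(e);
	if (!model_schedule_work(e) && fixed)
		model_put(e); /* the fix: no work run will drop this reference */
}

/* Fire twice before the worker runs; return the refcount left afterwards. */
int model_simulate(bool fixed)
{
	struct model_entry e = { .refcount = 1, .work_pending = false };

	model_fire(&e, fixed);
	model_fire(&e, fixed); /* work is already queued at this point */
	model_run_work(&e);
	return e.refcount;
}
```

In this model, model_simulate(false) leaves the count at 2 (one leaked
reference above the owner's base reference), while model_simulate(true)
returns it to 1, so a waiter for the final put could make progress.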

Note that this bug became more apparent (or became apparent at all) due to
commit ce664331b2 ("vmw_balloon: VMCI_DOORBELL_SET does not check status").

Fixes: 83e2ec765b ("VMCI: doorbell implementation.")
Reported-by: Francois Rigault <rigault.francois@gmail.com>
Cc: Jorgen Hansen <jhansen@vmware.com>
Cc: Adit Ranadive <aditr@vmware.com>
Cc: Alexios Zavras <alexios.zavras@intel.com>
Cc: Vishnu DASA <vdasa@vmware.com>
Cc: stable@vger.kernel.org
Signed-off-by: Nadav Amit <namit@vmware.com>
Reviewed-by: Vishnu Dasa <vdasa@vmware.com>
Link: https://lore.kernel.org/r/20190820202638.49003-1-namit@vmware.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-06 10:22:20 +02:00
altera-stapl treewide: kzalloc() -> kcalloc() 2018-06-12 16:19:22 -07:00
c2port kmemcheck: remove annotations 2017-11-15 18:21:04 -08:00
cardreader misc: rtsx: make several functions static 2018-07-03 13:01:48 +02:00
cb710 cb710: Convert to new IDA API 2018-08-21 23:54:18 -04:00
cxl cxl: Wrap iterations over afu slices inside 'afu_list_lock' 2019-03-23 20:10:03 +01:00
echo misc: Remove Blackfin DSP echo support 2018-03-26 15:56:37 +02:00
eeprom eeprom: at24: make spd world-readable again 2019-08-06 19:06:57 +02:00
genwqe genwqe: Prevent an integer overflow in the ioctl 2019-06-11 12:20:54 +02:00
ibmasm ibmasm: don't write out of bounds in read handler 2018-07-07 09:59:35 +02:00
lis3lv02d vfs: do bulk POLL* -> EPOLL* replacement 2018-02-11 14:34:03 -08:00
lkdtm lkdtm: support llvm-objcopy 2019-07-14 08:11:21 +02:00
mei mei: me: add Tiger Lake point LP device ID 2019-09-06 10:22:17 +02:00
mic mic: vop: Fix use-after-free on remove 2019-02-15 08:10:12 +01:00
ocxl ocxl: Fix endiannes bug in read_afu_name() 2019-01-09 17:38:43 +01:00
sgi-gru drivers/misc/sgi-gru: fix Spectre v1 vulnerability 2018-11-27 16:13:10 +01:00
sgi-xp sgi-xp: xpc_partition: mark expected switch fall-throughs 2018-07-07 17:38:57 +02:00
ti-st misc: ti-st: Fix memory leak in the error path of probe() 2018-08-02 10:35:04 +02:00
vmw_vmci VMCI: Release resource if the work is already queued 2019-09-06 10:22:20 +02:00
ad525x_dpot-i2c.c
ad525x_dpot-spi.c
ad525x_dpot.c misc: ad525x_dpot: macros should not use a trailing semicolon 2017-12-18 16:02:26 +01:00
ad525x_dpot.h misc: ad525x_dpot: Unnecessary space before function pointer arguments 2017-12-18 15:59:17 +01:00
apds990x.c misc: apds990x: Missing a blank line after declarations. 2017-12-18 16:02:26 +01:00
apds9802als.c misc: apds9802als: constify i2c_device_id 2017-08-28 16:55:49 +02:00
aspeed-lpc-ctrl.c misc: aspeed-lpc-ctrl: Enable FWH and A2H bridge cycles 2018-03-15 18:20:51 +01:00
aspeed-lpc-snoop.c drivers/misc: Aspeed LPC snoop output using misc chardev 2018-07-16 13:30:47 +02:00
atmel_tclib.c
atmel-ssc.c misc: atmel-ssc: Fix section annotation on atmel_ssc_get_driver_data 2018-11-27 16:13:10 +01:00
bh1770glc.c misc: bh1770glc: constify attribute_group structures. 2017-08-28 16:55:48 +02:00
cs5535-mfgpt.c
ds1682.c misc: ds1682: Ignore update-in-progress ETC reads 2018-01-09 17:03:57 +01:00
dummy-irq.c Annotate hardware config module parameters in drivers/misc/ 2017-04-20 12:02:32 +01:00
enclosure.c misc: enclosure: Remove unnecessary error check 2017-12-07 18:45:31 +01:00
fsa9480.c misc: fsa9480: Add blank line after declarations. 2018-01-09 17:03:57 +01:00
hmc6352.c misc: hmc6352: fix potential Spectre v1 2018-09-12 09:31:00 +02:00
hpilo.c vfs: do bulk POLL* -> EPOLL* replacement 2018-02-11 14:34:03 -08:00
hpilo.h misc: hpilo: Use SPDX-License-Identifier 2017-12-07 18:45:31 +01:00
ibmvmc.c misc: ibmvsm: Fix potential NULL pointer dereference 2019-01-31 08:14:35 +01:00
ibmvmc.h misc: IBM Virtual Management Channel Driver (VMC) 2018-05-14 16:35:42 +02:00
ics932s401.c misc: ics932s401: open brace should be on the previous line 2017-12-18 16:00:57 +01:00
ioc4.c misc: ioc4: constify pci_device_id. 2017-08-28 16:55:48 +02:00
isl29003.c misc: isl29003: Missing a blank line after declarations 2017-12-07 18:45:31 +01:00
isl29020.c misc: isl29020: constify i2c_device_id 2017-08-28 16:55:49 +02:00
Kconfig misc: IBM Virtual Management Channel Driver (VMC) 2018-05-14 16:35:42 +02:00
kgdbts.c Drivers: misc: fix out-of-bounds access in function param_set_kgdbts_var 2019-06-19 08:18:02 +02:00
lattice-ecp3-config.c
Makefile misc: IBM Virtual Management Channel Driver (VMC) 2018-05-14 16:35:42 +02:00
pch_phub.c MISC: add const to bin_attribute structures 2017-08-28 16:55:48 +02:00
pci_endpoint_test.c misc: pci_endpoint_test: Fix test_reg_bar to be updated in pci_endpoint_test 2019-06-15 11:54:06 +02:00
phantom.c vfs: do bulk POLL* -> EPOLL* replacement 2018-02-11 14:34:03 -08:00
pti.c drivers/misc/intel/pti: Rename the header file to free up the namespace 2017-12-17 12:52:34 +01:00
qcom-coincell.c ARM: qcom: silence an uninitialized variable warning 2016-05-01 14:20:04 -07:00
spear13xx_pcie_gadget.c
sram-exec.c misc: sram-exec: Use aligned fncpy instead of memcpy 2017-05-18 17:37:52 +02:00
sram.c misc: sram: enable clock before registering regions 2018-07-06 16:48:15 +02:00
sram.h misc: sram: Integrate protect-exec reserved sram area type 2017-01-25 11:48:03 +01:00
tifm_7xx1.c misc: tifm: Remove VLA 2018-04-23 13:31:27 +02:00
tifm_core.c
tsl2550.c tsl2550: fix lux1_input error in low light 2018-07-07 17:44:52 +02:00
vexpress-syscfg.c misc: vexpress: Off by one in vexpress_syscfg_exec() 2019-02-15 08:10:11 +01:00
vmw_balloon.c Merge 4.18-rc5 into char-misc-next 2018-07-16 09:04:54 +02:00