linux/drivers
Tejun Heo aff4514876 libata: fix sff host state machine locking while polling
commit 8eee1d3ed5 upstream.

The bulk of the ATA host state machine is implemented by
ata_sff_hsm_move().  The function is called from either the interrupt
handler or, if polling, a work item.  Unlike the interrupt path, the
polling path calls the function without holding the host lock, and
ata_sff_hsm_move() grabs the lock only selectively.

This is completely broken.  If an IRQ triggers while polling is in
progress, the two can easily race and end up accessing the hardware
and updating state machine state at the same time.  This can put the
state machine in an illegal state and lead to a crash like the
following.

  kernel BUG at drivers/ata/libata-sff.c:1302!
  invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN
  Modules linked in:
  CPU: 1 PID: 10679 Comm: syz-executor Not tainted 4.5.0-rc1+ #300
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
  task: ffff88002bd00000 ti: ffff88002e048000 task.ti: ffff88002e048000
  RIP: 0010:[<ffffffff83a83409>]  [<ffffffff83a83409>] ata_sff_hsm_move+0x619/0x1c60
  ...
  Call Trace:
   <IRQ>
   [<ffffffff83a84c31>] __ata_sff_port_intr+0x1e1/0x3a0 drivers/ata/libata-sff.c:1584
   [<ffffffff83a85611>] ata_bmdma_port_intr+0x71/0x400 drivers/ata/libata-sff.c:2877
   [<     inline     >] __ata_sff_interrupt drivers/ata/libata-sff.c:1629
   [<ffffffff83a85bf3>] ata_bmdma_interrupt+0x253/0x580 drivers/ata/libata-sff.c:2902
   [<ffffffff81479f98>] handle_irq_event_percpu+0x108/0x7e0 kernel/irq/handle.c:157
   [<ffffffff8147a717>] handle_irq_event+0xa7/0x140 kernel/irq/handle.c:205
   [<ffffffff81484573>] handle_edge_irq+0x1e3/0x8d0 kernel/irq/chip.c:623
   [<     inline     >] generic_handle_irq_desc include/linux/irqdesc.h:146
   [<ffffffff811a92bc>] handle_irq+0x10c/0x2a0 arch/x86/kernel/irq_64.c:78
   [<ffffffff811a7e4d>] do_IRQ+0x7d/0x1a0 arch/x86/kernel/irq.c:240
   [<ffffffff86653d4c>] common_interrupt+0x8c/0x8c arch/x86/entry/entry_64.S:520
   <EOI>
   [<     inline     >] rcu_lock_acquire include/linux/rcupdate.h:490
   [<     inline     >] rcu_read_lock include/linux/rcupdate.h:874
   [<ffffffff8164b4a1>] filemap_map_pages+0x131/0xba0 mm/filemap.c:2145
   [<     inline     >] do_fault_around mm/memory.c:2943
   [<     inline     >] do_read_fault mm/memory.c:2962
   [<     inline     >] do_fault mm/memory.c:3133
   [<     inline     >] handle_pte_fault mm/memory.c:3308
   [<     inline     >] __handle_mm_fault mm/memory.c:3418
   [<ffffffff816efb16>] handle_mm_fault+0x2516/0x49a0 mm/memory.c:3447
   [<ffffffff8127dc16>] __do_page_fault+0x376/0x960 arch/x86/mm/fault.c:1238
   [<ffffffff8127e358>] trace_do_page_fault+0xe8/0x420 arch/x86/mm/fault.c:1331
   [<ffffffff8126f514>] do_async_page_fault+0x14/0xd0 arch/x86/kernel/kvm.c:264
   [<ffffffff86655578>] async_page_fault+0x28/0x30 arch/x86/entry/entry_64.S:986

Fix it by ensuring that the polling path is holding the host lock
before entering ata_sff_hsm_move() so that all hardware accesses and
state updates are performed under the host lock.
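
The locking rule described above can be sketched in user space.  This
is only an illustration, not the libata code: a pthread mutex stands in
for the host lock, two threads stand in for the polling work item and
the interrupt handler, and every name (fake_port, fake_hsm_move(),
fake_poll_once(), fake_irq_once(), run_both_paths(), ROUNDS) is
hypothetical.  The point it shows is that the state machine function
itself takes no lock and both callers hold the lock for the whole call.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for port state; not the libata structures. */
enum hsm_state { HSM_IDLE, HSM_BUSY };

struct fake_port {
	pthread_mutex_t host_lock;	/* plays the role of the host lock */
	enum hsm_state state;
	unsigned long transitions;
};

#define ROUNDS 100000UL

/* Caller must hold host_lock; the function never locks on its own. */
static void fake_hsm_move(struct fake_port *p)
{
	p->state = (p->state == HSM_IDLE) ? HSM_BUSY : HSM_IDLE;
	p->transitions++;
}

/* Polling path: take the lock before entering the state machine. */
static void fake_poll_once(struct fake_port *p)
{
	pthread_mutex_lock(&p->host_lock);
	fake_hsm_move(p);
	pthread_mutex_unlock(&p->host_lock);
}

/* Interrupt path: already ran under the lock, modeled the same way. */
static void fake_irq_once(struct fake_port *p)
{
	pthread_mutex_lock(&p->host_lock);
	fake_hsm_move(p);
	pthread_mutex_unlock(&p->host_lock);
}

static void *poll_thread(void *arg)
{
	for (unsigned long i = 0; i < ROUNDS; i++)
		fake_poll_once(arg);
	return NULL;
}

static void *irq_thread(void *arg)
{
	for (unsigned long i = 0; i < ROUNDS; i++)
		fake_irq_once(arg);
	return NULL;
}

/* Run both paths concurrently and return the number of transitions. */
static unsigned long run_both_paths(struct fake_port *p)
{
	pthread_t poller, irq;

	pthread_mutex_init(&p->host_lock, NULL);
	p->state = HSM_IDLE;
	p->transitions = 0;

	pthread_create(&poller, NULL, poll_thread, p);
	pthread_create(&irq, NULL, irq_thread, p);
	pthread_join(poller, NULL);
	pthread_join(irq, NULL);

	pthread_mutex_destroy(&p->host_lock);
	return p->transitions;
}
```

Because every state update happens under the same lock, no transition
is lost regardless of how the two threads interleave.  Dropping the
lock from fake_poll_once() would reintroduce the kind of race the
commit describes, with both paths updating the state at once.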

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-and-tested-by: Dmitry Vyukov <dvyukov@google.com>
Link: http://lkml.kernel.org/g/CACT4Y+b_JsOxJu2EZyEf+mOXORc_zid5V1-pLZSroJVxyWdSpw@mail.gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-03 15:07:27 -08:00
accessibility
acpi nfit: fix multi-interface dimm handling, acpi6.1 compatibility 2016-03-03 15:07:24 -08:00
amba
android drivers: android: correct the size of struct binder_uintptr_t for BC_DEAD_BINDER_DONE 2016-03-03 15:07:10 -08:00
ata libata: fix sff host state machine locking while polling 2016-03-03 15:07:27 -08:00
atm
auxdisplay
base base/platform: Fix platform drivers with no probe callback 2016-02-17 12:30:55 -08:00
bcma
block zram: don't call idr_remove() from zram_remove() 2016-02-17 12:31:06 -08:00
bluetooth Bluetooth: Add support of Toshiba Broadcom based devices 2016-03-03 15:07:16 -08:00
bus bus: sunxi-rsb: Fix peripheral IC mapping runtime address 2015-12-22 11:42:30 -08:00
cdrom
char ipmi: move timer init to before irq is setup 2015-12-09 13:13:06 -06:00
clk clk: exynos: use irqsave version of spin_lock to avoid deadlock with irqs 2016-03-03 15:07:17 -08:00
clocksource clockevents/tcb_clksrc: Prevent disabling an already disabled clock 2016-03-03 15:07:15 -08:00
connector connector: bump skb->users before callback invocation 2016-01-04 21:46:45 -05:00
cpufreq cpufreq: Fix NULL reference crash while accessing policy->governor_data 2016-03-03 15:07:25 -08:00
cpuidle
crypto crypto: marvell/cesa - fix test in mv_cesa_dev_dma_init() 2016-02-17 12:31:05 -08:00
dca
devfreq
dio
dma dmaengine: dw: disable BLOCK IRQs for non-cyclic xfer 2016-03-03 15:07:24 -08:00
dma-buf
edac EDAC, mc_sysfs: Fix freeing bus' name 2016-03-03 15:07:17 -08:00
eisa
extcon
firewire IEEE 1394 subsystem patch: 2015-11-11 10:21:34 -08:00
firmware efi: Add pstore variables to the deletion whitelist 2016-03-03 15:07:09 -08:00
fmc
fpga fpga manager: Fix firmware resource leak on error 2015-11-24 15:25:46 -08:00
gpio gpio: revert get() to non-errorprogating behaviour 2015-12-17 15:48:29 +01:00
gpu drm/radeon/pm: adjust display configuration after powerstate 2016-03-03 15:07:23 -08:00
hid HID: multitouch: fix input mode switching on some Elan panels 2016-02-17 12:31:06 -08:00
hsi
hv Drivers: hv: vmbus: Fix a Host signaling bug 2016-03-03 15:07:16 -08:00
hwmon hwmon: (ads1015) Handle negative conversion values correctly 2016-03-03 15:07:25 -08:00
hwspinlock drivers/hwspinlock: fix race between radix tree insertion and lookup 2016-02-25 12:01:23 -08:00
hwtracing coresight: checking for NULL string in coresight_name_match() 2016-03-03 15:07:14 -08:00
i2c i2c: rcar: disable runtime PM correctly in slave mode 2015-12-19 12:00:37 +01:00
ide
idle
iio iio: inkern: fix a NULL dereference on error 2016-02-25 12:01:17 -08:00
infiniband IB/mlx5: Expose correct maximum number of CQE capacity 2016-03-03 15:07:25 -08:00
input Input: vmmouse - fix absolute device registration 2016-02-25 12:01:21 -08:00
iommu iommu/vt-d: Clear PPR bit to ensure we get more page request interrupts 2016-02-25 12:01:22 -08:00
ipack
irqchip irqchip/gic-v3-its: Fix double ICC_EOIR write for LPI in EOImode==1 2016-03-03 15:07:14 -08:00
isdn ser_gigaset: remove unnecessary kfree() calls from release method 2015-12-15 13:24:21 -05:00
leds
lguest
lightnvm lightnvm: wrong offset in bad blk lun calculation 2015-12-29 08:28:32 -07:00
macintosh
mailbox
mcb
md dm: fix dm_rq_target_io leak on faults with .request_fn DM w/ blk-mq paths 2016-03-03 15:07:14 -08:00
media tda1004x: only update the frontend properties if locked 2016-03-03 15:07:14 -08:00
memory fsl-ifc: add missing include on ARM64 2015-12-16 00:16:58 +01:00
memstick
message SCSI queue for 4.4. 2015-11-12 07:06:18 -05:00
mfd
misc cxl: use correct operator when writing pcie config space values 2016-03-03 15:07:17 -08:00
mmc mmc: sdhci: Allow override of get_cd() called from sdhci_request() 2016-03-03 15:07:16 -08:00
mtd mtd: nand: assign reasonable default name for NAND drivers 2016-02-17 12:30:56 -08:00
net rtlwifi: rtl8723be: Fix module parameter initialization 2016-03-03 15:07:13 -08:00
nfc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-11-10 18:11:41 -08:00
ntb
nubus
nvdimm libnvdimm: fix namespace object confusion in is_uuid_busy() 2016-02-25 12:01:21 -08:00
nvme NVMe: IO ending fixes on surprise removal 2015-12-22 10:12:04 -07:00
nvmem
of of/irq: Export of_irq_find_parent again 2015-12-09 09:08:36 -06:00
oprofile
parisc parisc iommu: fix panic due to trying to allocate too large region 2015-12-12 16:07:25 +01:00
parport
pci ACPI / PCI / hotplug: unlock in error path in acpiphp_enable_slot() 2016-03-03 15:07:24 -08:00
pcmcia
perf
phy phy: twl4030-usb: Fix unbalanced pm_runtime_enable on module reload 2016-02-25 12:01:14 -08:00
pinctrl pinctrl: bcm2835: Fix initial value for direction_output 2015-12-14 11:31:20 +01:00
platform ideapad-laptop: Add Lenovo Yoga 700 to no_hw_rfkill dmi list 2016-03-03 15:07:24 -08:00
pnp
power
powercap powercap / RAPL: fix BIOS lock check 2015-12-12 02:31:11 +01:00
pps
ps3
ptp
pwm pwm: Changes for v4.4-rc1 2015-11-11 09:16:10 -08:00
rapidio
ras
regulator regulator: mt6311: MT6311_REGULATOR needs to select REGMAP_I2C 2016-03-03 15:07:17 -08:00
remoteproc remoteproc: fix memory leak of remoteproc ida cache layers 2015-11-26 17:44:28 +02:00
reset
rpmsg
rtc rtc: da9063: fix access ordering error during RTC interrupt at system power on 2015-12-20 13:39:29 +01:00
s390 s390/dasd: fix performance drop 2016-03-03 15:07:12 -08:00
sbus
scsi qla2xxx: Fix stale pointer access. 2016-03-03 15:07:27 -08:00
sfi
sh drivers: sh: Get rid of CONFIG_ARCH_SHMOBILE_MULTI 2015-11-17 02:12:46 +09:00
sn
soc Few Keystone fixes for 4.4-rcx 2015-11-25 23:48:12 +01:00
spi spi: atmel: fix gpio chip-select in case of non-DT platform 2016-03-03 15:07:27 -08:00
spmi
ssb
staging Revert "Staging: panel: usleep_range is preferred over udelay" 2016-03-03 15:07:26 -08:00
target target: Fix race with SCF_SEND_DELAYED_TAS handling 2016-03-03 15:07:27 -08:00
tc
thermal Thermal: do thermal zone update after a cooling device registered 2016-03-03 15:07:25 -08:00
thunderbolt
tty serial: omap: Prevent DoS using unprivileged ioctl(TIOCSRS485) 2016-02-25 12:01:14 -08:00
uio
usb cdc-acm:exclude Samsung phone 04e8:685d 2016-03-03 15:07:26 -08:00
uwb
vfio Revert: "vfio: Include No-IOMMU mode" 2015-12-04 08:38:42 -07:00
vhost vhost: replace % with & on data path 2015-12-07 17:28:10 +02:00
video OMAPDSS: fix timings for VENC to match what omapdrm expects 2015-12-09 12:57:13 +02:00
virt
virtio virtio_pci: fix use after free on release 2016-03-03 15:07:18 -08:00
vlynq
vme
w1
watchdog watchdog: mtk_wdt: Use MODE_KEY when stopping the watchdog 2015-11-23 09:00:09 +01:00
xen xen: bug fixes for 4.4-rc5 2015-12-18 12:24:52 -08:00
zorro
Kconfig
Makefile null_blk: register as a LightNVM device 2015-11-16 15:22:28 -07:00